NANBV diagnostics and vaccines.

A new virus, hepatitis C virus (HCV), which has proven to be the major etiologic agent of blood-borne
NANBH, was discovered by Applicant. The initial work on this virus, which includes a partial genomic sequence
of the prototype HCV isolate, is described in EPO Pub. No. 318,216 and PCT Pub. No. WO 89/04669. The
present invention, which in part is based on new HCV sequences and polypeptides which are not disclosed in
the above-cited publications, includes the application of these new sequences and polypeptides in
immunoassays, probe diagnostics, anti-HCV antibody production, PCR technology, and recombinant DNA technology.
Included within the invention also are novel immunogenic polypeptides encoded within clones containing HCV
cDNA, novel methods for purifying an immunogenic HCV polypeptide, and antisense polynucleotides derived
from HCV cDNA.
NANBV DIAGNOSTICS AND VACCINES

Technical Field

The invention relates to materials and methodologies for managing the spread of non-A, non-B hepatitis
virus (NANBV) infection. More specifically, it relates to polynucleotides derived from the genome of an
etiologic agent of NANBH, hepatitis C virus (HCV), to polypeptides encoded therein, and to antibodies
directed to the polypeptides. These reagents are useful as screening agents for HCV and its infection, and
as protective agents against the disease.
Background Art

Non-A, non-B hepatitis (NANBH) is a transmissible disease or family of diseases that are believed to be
viral-induced, and that are distinguishable from other forms of viral-associated liver diseases, including that
[...] hepatitis B virus (HBV), and delta [...]. NANBH was first identified in transfused individuals.
Transmission from man to chimpanzee and serial passage in chimpanzees provided evidence that NANBH is due
to a transmissible infectious agent or [...]

[...] infection of [...] to the integration of HBV DNA into the genome of liver cells. In [...]
antigens associated with [...] by more than one infectious agent, as well as the [...]
for example the reviews by Prince (1983). [...]
sequences of new HCV isolates.
Disclosure of the Invention

The present invention is based, in part, on new HCV sequences and polypeptides that are not disclosed
in EPO Pub. No. 318,216, or in PCT Pub. No. WO 89/04669. Included within the invention is the application
of these new sequences and polypeptides in, inter alia, immunodiagnostics, probe diagnostics, anti-HCV
antibody production, PCR technology and recombinant DNA technology. Included within the invention, also,
are new immunoassays based upon the immunogenicity of HCV polypeptides disclosed herein. The new
subject matter claimed herein, while developed using techniques described in, for example, EPO Pub. No.
318,216, has a priority date which antecedes that publication, or any counterpart thereof. Thus, the invention
provides novel compositions and methods useful for screening samples for HCV antigens and antibodies,
and useful for treatment of HCV infections.

Accordingly, one aspect of the invention is a recombinant polynucleotide comprising a sequence
derived from HCV cDNA, wherein the HCV cDNA is in clone 13i, or clone 26j, or clone 59a, or clone 84a, or
clone CA156e, or clone 167b, or clone pi14a, or clone CA216a, or clone CA290a, or clone ag30a, or clone
205a, or clone 18g, or clone 16jh, or wherein the HCV cDNA is of a sequence indicated by nucleotide
numbers -319 to 1348 or 8659 to 8866 in Fig. 17.

Another aspect of the invention is a purified polypeptide comprising an epitope encoded within HCV
cDNA wherein the HCV cDNA is of a sequence indicated by nucleotide numbers -319 to 1348 or 8659 to
8866 in Fig. 17.

Yet another aspect of the invention is an immunogenic polypeptide produced by a cell transformed with
a recombinant expression vector comprising an ORF of DNA derived from HCV cDNA, wherein the HCV
cDNA is comprised of a sequence derived from the HCV cDNA sequence in clone CA279a, or clone CA74a,
or clone 13i, or clone CA290a, or clone 33c, or clone 40b, or clone 33b, or clone 25c, or clone 14c, or clone
8f, or clone 33f, or clone 33g, or clone 39c, or clone 15e, and wherein the ORF is operably linked to a
control sequence compatible with a desired host.

Another aspect of the invention is a peptide comprising an HCV epitope, wherein the peptide is of the
formula
AAx-AAy,
wherein x and y designate amino acid numbers shown in Fig. 17, and wherein the peptide is selected from
the group consisting of AA1-AA25, AA1-AA50, AA1-AA84, AA9-AA177, AA1-AA10, AA5-AA20, AA20-AA25,
AA35-AA45, AA50-AA100, AA40-AA90, AA45-AA65, AA65-AA75, AA80-AA90, AA99-AA120, AA95-AA110,
AA105-AA120, AA100-AA150, AA150-AA200, AA155-AA170, AA190-AA210, AA200-AA250, AA220-AA240,
AA245-AA265, AA250-AA300, AA290-AA330, AA290-AA305, AA300-AA350, AA310-AA330, AA350-AA400,
AA380-AA395, AA405-AA495, AA400-AA450, AA405-AA415, AA415-AA425, AA425-AA435, AA437-AA582,
AA450-AA500, AA440-AA460, AA460-AA470, AA475-AA495, AA500-AA550, AA511-AA690, AA515-AA550,
AA550-AA600, AA550-AA625, AA575-AA605, AA585-AA600, AA600-AA650, AA600-AA625, AA635-AA665,
AA650-AA700, AA645-AA680, AA700-AA750, AA700-AA725, AA700-AA750, AA725-AA775, AA770-AA790,
AA750-AA800, AA800-AA815, AA825-AA850, AA850-AA875, AA800-AA850, AA920-AA990, AA850-AA900,
AA920-AA945, AA940-AA965, AA970-AA990, AA950-AA1000, AA1000-AA1060, AA1000-AA1025, AA1000-AA1050,
AA1025-AA1040, AA1040-AA1055, AA1075-AA1175, AA1050-AA1200, AA1070-AA1100, AA1100-AA1130,
AA1140-AA1165, AA1192-AA1457, AA1195-AA1250, AA1200-AA1225, AA1225-AA1250, AA1250-AA1300,
AA1260-AA1310, AA1260-AA1280, AA1266-AA1428, AA1300-AA1350, AA1290-AA1310, AA1310-AA1340,
AA1345-AA1405, AA1345-AA1365, AA1350-AA1400, AA1365-AA1380, AA1380-AA1405, AA1400-AA1450,
AA1450-AA1500, AA1460-AA1475, AA1475-AA1515, AA1475-AA1500, AA1500-AA1550, AA1500-AA1515,
AA1515-AA1550, AA1550-AA1600, AA1545-AA1560, AA1569-AA1931, AA1570-AA1590, AA1595-AA1610,
AA1590-AA1650, AA1610-AA1645, AA1650-AA1690, AA1685-AA1770, AA1689-AA1805, AA1690-AA1720,
AA1694-AA1735, AA1720-AA1745, AA1745-AA1770, AA1750-AA1800, AA1775-AA1810, AA1795-AA1850,
AA1850-AA1900, AA1900-AA1950, AA1900-AA1920, AA1916-AA2021, AA1920-AA1940, AA1949-AA2124,
AA1950-AA2000, AA1950-AA1985, AA1980-AA2000, AA2000-AA2050, AA2005-AA2025, AA2020-AA2045,
AA2045-AA2100, AA2045-AA2070, AA2054-AA2223, AA2070-AA2100, AA2100-AA2150, AA2150-AA2200,
AA2200-AA2250, AA2200-AA2325, AA2250-AA2330, AA2255-AA2270, AA2265-AA2280, AA2280-AA2290,
AA2287-AA2385, AA2300-AA2350, AA2290-AA2310, AA2310-AA2330, AA2330-AA2350, AA2350-AA2400,
AA2348-AA2464, AA2345-AA2415, AA2345-AA2375, AA2370-AA2410, AA2371-AA2502, AA2400-AA2450,
AA2400-AA2425, AA2415-AA2450, AA2445-AA2500, AA2445-AA2475, AA2470-AA2490, AA2500-AA2550,
AA2505-AA2540, AA2535-AA2560, AA2550-AA2600, AA2560-AA2580, AA2600-AA2650, AA2605-AA2620,
AA2620-AA2650, AA2640-AA2660, AA2650-AA2700, AA2655-AA2670, AA2670-AA2700, AA2700-AA2750,
AA2740-AA2760, AA2750-AA2800, AA2755-AA2780, AA2780-AA2830, AA2785-AA2810, AA2796-AA2886,
AA2810-AA2825, AA2800-AA2850, AA2850-AA2900, AA2850-AA2865, AA2885-AA2905, AA2900-AA2950,
AA2910-AA2930, AA2925-AA2950, AA2945-end (C' terminal).
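
The AAx-AAy notation above can be read mechanically. The following is a minimal sketch, assuming that x and y are 1-based, inclusive residue numbers into a polyprotein sequence held as a one-letter-code string. The function names and the toy sequence are hypothetical illustrations and are not taken from Fig. 17; the open-ended "AA2945-end" designation is deliberately not handled.

```python
import re

# Hypothetical helpers: the specification defines only the AAx-AAy notation.
_PEPTIDE_RE = re.compile(r"^AA(\d+)-AA(\d+)$")


def parse_peptide(designation: str) -> tuple[int, int]:
    """Return (x, y) for a designation such as 'AA1-AA25'."""
    match = _PEPTIDE_RE.match(designation.strip())
    if match is None:
        raise ValueError(f"not an AAx-AAy designation: {designation!r}")
    x, y = int(match.group(1)), int(match.group(2))
    if x < 1 or y < x:
        raise ValueError(f"invalid residue range: {designation!r}")
    return x, y


def extract_peptide(polyprotein: str, designation: str) -> str:
    """Slice residues x..y (assumed 1-based, inclusive) from a polyprotein."""
    x, y = parse_peptide(designation)
    if y > len(polyprotein):
        raise ValueError("range extends past the end of the sequence")
    return polyprotein[x - 1:y]


if __name__ == "__main__":
    toy_sequence = "ACDEFGHIKL"  # placeholder residues, not HCV data
    print(parse_peptide("AA1-AA25"))
    print(extract_peptide(toy_sequence, "AA3-AA6"))
```

With the real polyprotein of Fig. 17 substituted for the placeholder string, the same call would return the residues of any listed peptide.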

Still another aspect of the invention is a monoclonal antibody directed against an epitope encoded in
HCV cDNA, wherein the HCV cDNA is of a sequence indicated by nucleotide numbers -319 to 1348 or
8659 to 8866 in Fig. 17, or is the sequence present in clone 13i, or clone 26j, or clone 59a, or clone 84a, or
clone CA156e, or clone 167b, or clone pi14a, or clone CA216a, or clone CA290a, or clone ag30a, or clone
205a, or clone 18g, or clone 16jh.

Yet another aspect of the invention is a preparation of purified polyclonal antibodies directed against a
polypeptide comprised of an epitope encoded within HCV cDNA, wherein the HCV cDNA is of a sequence
indicated by nucleotide numbers -319 to 1348 or 8659 to 8866 in Fig. 17, or is the sequence present in
clone 13i, or clone 26j, or clone 59a, or clone 84a, or clone CA156e, or clone 167b, or clone pi14a, or clone
CA216a, or clone CA290a, or clone ag30a, or clone 205a, or clone 18g, or clone 16jh.

Still another aspect of the invention is a polynucleotide probe for HCV, wherein the probe is comprised
of an HCV sequence derived from an HCV cDNA sequence indicated by nucleotide numbers -319 to 1348
or 8659 to 8866 in Fig. 17, or from the complement of the HCV cDNA sequence.

Yet another aspect of the invention is a kit for analyzing samples for the presence of polynucleotides
from HCV comprising a polynucleotide probe containing a nucleotide sequence of about 8 or more
nucleotides, wherein the nucleotide sequence is derived from HCV cDNA which is of a sequence indicated
by nucleotide numbers -319 to 1348 or 8659 to 8866 in Fig. 17, wherein the polynucleotide probe is in a
suitable container.

Another aspect of the invention is a kit for analyzing samples for the presence of an HCV antigen
comprising an antibody which reacts immunologically with an HCV antigen, wherein the antigen contains an
epitope encoded within HCV cDNA which is of a sequence indicated by nucleotide numbers -319 to 1348 or
8659 to 8866 in Fig. 17, or wherein the HCV cDNA is in clone 13i, or clone 26j, or clone 59a, or clone 84a,
or clone CA156e, or clone 167b, or clone pi14a, or clone CA216a, or clone CA290a, or clone ag30a, or
clone 205a, or clone 18g, or clone 16jh.

Yet another aspect of the invention is a kit for analyzing samples for the presence of an HCV antibody
comprising an antigenic polypeptide containing an HCV epitope encoded within HCV cDNA which is of a
sequence indicated by nucleotide numbers -319 to 1348 or 8659 to 8866 in Fig. 17, or is in clone 13i, or
clone 26j, or clone 59a, or clone 84a, or clone CA156e, or clone 167b, or clone pi14a, or clone CA216a, or
clone CA290a, or clone ag30a, or clone 205a, or clone 18g, or clone 16jh.

Another aspect of the invention is a kit for analyzing samples for the presence of an HCV antibody
comprising an antigenic polypeptide expressed from HCV cDNA in clone CA279a, or clone CA74a, or clone
13i, or clone CA290a, or clone 33c, or clone 40b, or clone 33b, or clone 25c, or clone 14c, or clone 8f, or
clone 33f, or clone 33g, or clone 39c, or clone 15e, wherein the antigenic polypeptide is present in a
suitable container.

Still another aspect of the invention is a method for detecting HCV nucleic acids in a sample
comprising:

(a) reacting nucleic acids of the sample with a polynucleotide probe for HCV, wherein the probe is
comprised of an HCV sequence derived from an HCV cDNA sequence indicated by
nucleotide numbers -319 to 1348 or 8659 to 8866 in Fig. 17, and wherein the reacting is under conditions
which allow the formation of a polynucleotide duplex between the probe and the HCV nucleic acid from the
sample; and (b) detecting a polynucleotide duplex which contains the probe, formed in step (a).
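
The duplex formation in steps (a) and (b) can be illustrated computationally. This is a minimal sketch under stated assumptions: sequences are written in DNA letters (as for cDNA), only perfect base-pairing is considered, and the function names, the 8-nucleotide default and the toy target are illustrative choices rather than anything prescribed here. Real hybridization depends on stringency conditions that the sketch ignores.

```python
# Watson-Crick complement table for DNA letters (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA-letter sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def probe_sites(probe: str, target: str, min_len: int = 8) -> list[int]:
    """Return 0-based start positions in `target` where `probe` could form
    a perfectly matched duplex (exact-match simplification)."""
    probe, target = probe.upper(), target.upper()
    if len(probe) < min_len:
        raise ValueError(f"probe shorter than {min_len} nucleotides")
    site = reverse_complement(probe)  # strand the probe would anneal to
    positions = []
    start = target.find(site)
    while start != -1:
        positions.append(start)
        start = target.find(site, start + 1)
    return positions


if __name__ == "__main__":
    toy_target = "GGATCCATGAGCACGAATCC"  # placeholder, not an HCV sequence
    toy_probe = reverse_complement("ATGAGCAC")
    print(toy_probe, probe_sites(toy_probe, toy_target))
```

An empty list corresponds to "no duplex detected" in this simplified model.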

Yet another aspect of the invention is an immunoassay for detecting an HCV antigen comprising:

(a) incubating a sample suspected of containing an HCV antigen with an antibody directed against an HCV
epitope encoded in HCV cDNA, wherein the HCV cDNA is of a sequence indicated by nucleotide numbers
-319 to 1348 or 8659 to 8866 in Fig. 17, or is the sequence present in clone 13i, or clone 26j, or clone 59a,
or clone 84a, or clone CA156e, or clone 167b, or clone pi14a, or clone CA216a, or clone CA290a, or clone
ag30a, or clone 205a, or clone 18g, or clone 16jh, and wherein the incubating is under conditions which
allow formation of an antigen-antibody complex; and (b) detecting an antibody-antigen complex formed in
step (a) which contains the antibody.

Still another aspect of the invention is an immunoassay for detecting antibodies directed against an
HCV antigen comprising:

(a) incubating a sample suspected of containing anti-HCV antibodies with an antigen polypeptide containing
an epitope encoded in HCV cDNA, wherein the HCV cDNA is of a sequence indicated by nucleotide
numbers -319 to 1348 or 8659 to 8866 in Fig. 17, or is the sequence present in clone 13i, or clone 26j, or
clone 59a, or clone 84a, or clone CA156e, or clone 167b, or clone pi14a, or clone CA216a, or clone
CA290a, or clone ag30a, or clone 205a, or clone 18g, or clone 16jh, and wherein the incubating is under
conditions which allow formation of an antigen-antibody complex; and (b) detecting an antibody-antigen
complex formed in step (a) which contains the antigen polypeptide.

Another aspect of the invention is a vaccine for treatment of HCV infection comprising an immunogenic
polypeptide containing an HCV epitope encoded in HCV cDNA, wherein the HCV cDNA is of a sequence
indicated by nucleotide numbers -319 to 1348 or 8659 to 8866 in Fig. 17, or is the sequence present in
clone 13i, or clone 26j, or clone 59a, or clone 84a, or clone CA156e, or clone 167b, or clone pi14a, or clone
CA216a, or clone CA290a, or clone ag30a, or clone 205a, or clone 18g, or clone 16jh, and wherein the
immunogenic polypeptide is present in a pharmacologically effective dose in a pharmaceutically acceptable [...]

Yet another aspect of the invention is a method for producing antibodies to HCV comprising administering
to an individual an isolated immunogenic polypeptide containing an HCV epitope encoded in HCV cDNA,
wherein the HCV cDNA is of a sequence indicated by nucleotide numbers -319 to 1348 or 8659 to 8866 in
Fig. 17, or is of the sequence present in clone CA279a, or clone CA74a, or clone 13i, or clone CA290a, or
clone 33c, or clone 40b, or clone 33b, or clone 25c, or clone 14c, or clone 8f, or clone 33f, or clone 33g, or
clone 39c, or clone 15e, and wherein the immunogenic polypeptide is present in a pharmacologically [...]

[...] polypeptide C100-3 comprising:

(a) providing a crude cell lysate containing polypeptide C100-3,

(b) treating the crude cell lysate with an amount of acetone which causes the polypeptide to
precipitate,

(c) isolating and solubilizing the precipitated material,

(d) isolating the C100-3 polypeptide by anion exchange chromatography, and

(e) further isolating the C100-3 polypeptide of step (d) by gel filtration.
Brief Description of the Drawings

Fig. 1 shows the sequence of the HCV cDNA in clone 12f, and the amino acids encoded therein.
Fig. 2 shows the HCV cDNA sequence in clone k9-1, and the amino acids encoded therein.
Fig. 3 shows the sequence of clone 15e, and the amino acids encoded therein.
Fig. 4 shows the nucleotide sequence of HCV cDNA in clone 13i, the amino acids encoded therein,
and the sequences which overlap with clone 12f.
Fig. 5 shows the nucleotide sequence of HCV cDNA in clone 26j, the amino acids encoded therein,
and the sequences which overlap clone 13i.
Fig. 6 shows the nucleotide sequence of HCV cDNA in clone CA59a, the amino acids encoded
therein, and the sequences which overlap with clones 26j and K9-1.
Fig. 7 shows the nucleotide sequence of HCV cDNA in clone CA84a, the amino acids encoded
therein, and the sequences which overlap with clone CA59a.
Fig. 8 shows the nucleotide sequence of HCV cDNA in clone CA156e, the amino acids encoded
therein, and the sequences which overlap with CA84a.
Fig. 9 shows the nucleotide sequence of HCV cDNA in clone CA167b, the amino acids encoded
therein, and the sequences which overlap CA156e.
Fig. 10 shows the nucleotide sequence of HCV cDNA in clone CA216a, the amino acids encoded
therein, and the overlap with clone CA167b.
Fig. 11 shows the nucleotide sequence of HCV cDNA in clone CA290a, the amino acids encoded
therein, and the overlap with clone CA216a.
Fig. 12 shows the nucleotide sequence of HCV cDNA in clone ag30a and the overlap with clone
CA290a.
Fig. 13 shows the nucleotide sequence of HCV cDNA in clone CA205a, and the overlap with the HCV
cDNA sequence in clone CA290a.
Fig. 14 shows the nucleotide sequence of HCV cDNA in clone 18g, and the overlap with the HCV
cDNA sequence in clone ag30a.
Fig. 15 shows the nucleotide sequence of HCV cDNA in clone 16jh, the amino acids encoded therein,
and the overlap of nucleotides with the HCV cDNA sequence in clone 15e.
Fig. 16 shows the ORF of HCV cDNA derived from clones pi14a, CA167b, CA156e, CA84a, CA59a,
K9-1, 12f, 14i, 11b, 7f, 7e, 8h, 33c, 40b, 37b, 35, 36, 81, 32, 33b, 25c, 14c, 8f, 33f, 33g, 39c, 35f, 19g, 26g, [...]
Fig. 17 shows the sense strand of the compiled HCV cDNA sequence derived from the above-described
clones and the compiled HCV cDNA sequence published in EPO Pub. No. 318,216. The clones
[...] the first and last nucleotides of the published sequence. Also shown in the figure [...] the
amino [...]
[...] mapping studies.
Fig. 19 shows the hydrophobicity profiles of polyproteins encoded in HCV and in West Nile virus.
Fig. 20 is a tracing of the hydrophilicity/hydrophobicity profile and of the antigenic index of the [...]
Fig. [...] shows the conserved co-linear peptides in HCV and Flaviviruses.

Modes for Carrying Out the Invention 
The term "hepatitis C virus" has been reserved by workers in the field for an heretofore unknown [...]
HCV. The information provided herein, although derived from the prototype strain or isolate of HCV,
[...] CDC/HCV1 (also called HCV1), is sufficient to allow a viral taxonomist to identify [...]

SIX JUSSSSL. "cng the ou'ter surlace of the virion envelope ana projections that are about 5- 
10 arrSS^ obtain variations at *e amino acid and nucleic 
JK£S with »e prototype isolate. ™^^^^tZ^ 
about 40%) homology in the total ^JLT^J ^ ^ defined ^ HCV strai ns according to 
found that other less homologous ^«WJJ These woud nuc , e otides. 

[...] similar to that of HCV1, and the presence of co-linear peptide sequences that are conserved
with HCV1. In addition, the genome would be a positive-stranded RNA.

[...] encodes at least one epitope which is immunologically identifiable with an epitope in the HCV
genome from which the cDNAs described herein are derived; preferably the epitope is contained in an amino
[...] compared to other known Flaviviruses.

In addition to the above, the following parameters of nucleic acid homology and amino acid homology
are applicable. [...]

and isolates are evolutional y rrtMc I. RJ s ^P^ m 6Q% or ater , and ^ ^ 

'^JS^^J^^'EZ * »» --ponding contiguous sequences of at 
probably about 80% or gr eater ana «v *° HCV stfain genomic seque nce and 

least about 13 nucleolus. The TO ^^"^"~JJ^f^f a J es ^ ovm in ^ art. For example, they can 
HCV, and the HCV cDNA sequence® J^^TSi IcS foJTsteble duplexes between homologous 

rr^ffiS 1 " r ed r 9esMon ^ " 9,e 

^SeSu^ HCV strains or 

Because of the evolutionary relations!,,* of the ^ or isolates are 

is o.ates are identifiable by their homology * * J*2J* *J ^TboS' 70% homologous, and even 
expected to be more than about 40% homologous. ^ ab ™ ™ m0 ° re ^ ^ M% 

m ore P^'y--^^^ 

SS^JSS S tttXEZtt — J —nee encoded therein can 
be determined, and the corresponding regions compared. seflU ence refers to a polynucleotide 

As used herein, a polynucleotide "derived from" a designated sequence refers to a polynucleotide
sequence which is comprised of a sequence of [...] preferably at
least about 8 nucleotides, more preferably at least about [...] more preferably at
least about 15-20 nucleotides corresponding to [...] designated nucleotide sequence.
"Corresponding" means homologous to or complementary to [...]. Preferably, the
sequence of the region from which [...] is homologous to or complementary to a
sequence which is unique to an HCV genome. Whether or not a sequence is unique to the HCV genome
can be determined by techniques known to those of skill in the art. [...] sequence can be
compared to sequences in databanks, e.g., [...] to the known sequences of other viral

polynucleotides formed by hybridization car ' "e detem ned y areas jn duplex poly . 

digestion with a nuclease such as S that 8 f*^' y TfT „ "d. include but OT not limited to. 
nucleotides. Regions <™ regions, 
for example, regions encoding specify epitopes, as wen as r»n- nucleotide sequence shown, 

„ The derived polynucleotide is not nearly Pj^pTSlS ^thesis or replication or 
but may be generated m any man ner incfodlng ^ example c e y fe ^ of ^ 

reverse transcription or J ^JS^t?^*^^ with an intended ° SS - 

designated sequence may be modified in ways known mvmm desianated nuC | eic acid sequence 

ilmilarly. a polypeptide or amino acid sequence derived from a encoded In the 

56 to those of skill in the art. intends a oolvnucleotlde of genomic. cONA, 
[...] other than that to which it is linked in [...]
The term "polynucleotide" as used herein refers to [...] nucleotides of any length, either
ribonucleotides or [...] primary structure of the molecule.
Thus, this term includes [...] single stranded RNA. It
also includes known types of modifications, for example, [...] which are known in the art, methylation,
"caps", substitution of one or more of the naturally occurring nucleotides with an analog, internucleotide
modifications such as, for example, those with uncharged linkages (e.g., methyl phosphonates,
phosphotriesters, phosphoamidates, [...]) and with [...] linkages (e.g., phosphorothioates, [...])
[...] for example proteins (including for [...] as
well as unmodified forms of the polynucleotide.

The term "purified viral polynucleotide" refers to an HCV genome or fragment thereof which is

essentially free. i.e.. contains less than about 50% _pre^ ™ natura |.y associated, 

preferably less than about 90% of polypeptides with ^J^r^ £ toSTln the art. and include for 
Techniques for purifying viral and separation of the 

essentially free. i.e.. contains lass j£, ^ viral polypeptide is naturally 

preferably less than about 90%. of °**r™££ h w art. and examples of these 
associated. Techniques for purifying viral gSj^'JZSoJde" refers to an HCV genome or 
25 techniques are discussed infra. The tem. I ^^^Si 20%. Preferably less than about 
fragment thereof which is 7^^*^55- with which the viral polynucleotide is 

mentation according to density. cultures", and other such terms 

"Recombinant host cells", "host cells" <W s cjMnjJ ^ ojla* ^ {q ^ ^ 

denoting microorganisms or higher eu ^ o C r ^0^^ ^^ oSier transfer DNA. and include the 
can be, or have been, used as rec.p.ents for nmnbran t vector gj 

- «n iWiS rc2jrsa"~ ~ - «- dna 

clmpC^ a cosmid . etc . « 

^ - ' ~ - -~ ° f 

A "vector" is a replicon in which another polynucleotide segment is attached, so as to bring about the
replication and/or expression of the attached segment.

"Control sequence" refers to [...] necessary to effect the expression of
coding sequences to which they are ligated. The [...] depending upon [...]
include promoter, ribosomal binding

sa x™ r fir— - - - - — 

««JJ1*«»2 . ^ o, ap*™*— se,„e»c «t** encodes . pMvwft* 
[...] "Immunologically identifiable with/as" refers to the presence of epitope(s) and polypeptide(s) [...]
immunological identity may be

£zr»J2-rs5 SKSisS'-i.— -*> — - — - 

: Trj,rr pCrr y ^ — - - 

The M «*£T frilTt S^nttn S polypes ™ 

, ploorSid. « olt.m*oly. may b. MgnM Mo Urn r^™"* 

SSJS* — ■ — *■ » — 1 i! — — ' " 

to H=*«d». «hk* «» - TO S"T,oS » olompil t S MM**. body 

myelomas. ^reparation of HCV which has been Isolated from the cellular 

[...] "HCV particles" as used herein include entire virion as well as particles which are
intermediates in virion formation. HCV particles generally have one or more HCV proteins associated with [...]

[...] "probe" refers to a [...] sequence. [...]
As used herein, the term "target region" refers to a region of the nucleic acid which is to be amplified [...]
As used herein, the term "viral RNA", which includes HCV RNA, refers to RNA from the viral genome,
fragments thereof, transcripts thereof, and mutant [...].
As used herein, a "biological sample" refers to a sample [...] isolated from an individual,
including but not limited to, for example, plasma, serum, spinal fluid, lymph fluid, the external sections of
the skin, respiratory, intestinal, and [...] conditioned medium resulting [...]
components).

II. Description of the Invention

The practice of the present invention [...] of molecular biology, [...]. Such techniques are [...]
Sambrook, MOLECULAR CLONING; A LABORATORY MANUAL (1982); DNA CLONING, VOLUMES I AND II
(D.N. Glover ed. 1985); OLIGONUCLEOTIDE SYNTHESIS [...]; NUCLEIC ACID HYBRIDIZATION (B.D. Hames &
S.J. Higgins eds. 1984); TRANSCRIPTION AND TRANSLATION (B.D. Hames & S.J. Higgins eds. 1984);
ANIMAL CELL CULTURE (R.I. Freshney ed. 1986); IMMOBILIZED CELLS AND ENZYMES (IRL Press,
1986); B. Perbal, A PRACTICAL GUIDE TO MOLECULAR CLONING [...]; the series, METHODS IN
ENZYMOLOGY (Academic Press [...]); GENE TRANSFER VECTORS FOR MAMMALIAN CELLS (J.H.
Miller and M.P. Calos eds. 1987, [...]); Methods in Enzymology Vol. 154 and Vol.
155 (Wu [...]); [...] Walker, eds. (1987), IMMUNOCHEMICAL [...] London); Scopes, (1987), PROTEIN
[...] by reference.

The useful materials and processes of the present invention are made possible by the provision of a
[...] sequences. These [...]

For example, from the sequences it is possible to synthesize [...]
larger, which are useful as [...] of the virus [...]
donated blood, sera of subjects suspected [...] the virus, or cell culture [...]
is replicating. In addition, the [...]
polypeptides which are useful as diagnostic reagents for the presence of antibodies raised during HCV
infection. Antibodies to purified polypeptides derived from the cDNAs may [...]
antigens in biological samples, [...]
NANBH, and in tissue culture systems being used for HCV replication. Moreover, the immunogenic [...]

[...] novel cDNA sequences described [...]
genome. Polynucleotide probes and primers derived from these [...] cDNA [...]

[...] of HCV appears to be RNA comprised primarily of a large open
reading [...] and the
immunogenic polypeptides described herein, as well as antibodies directed against these polypeptides are
also useful in the isolation and identification of the blood-borne NANBV (BB-NANBV) agent(s). For example,
antibodies directed against HCV epitopes contained in polypeptides derived from the cDNAs may be used
in processes based upon affinity chromatography to isolate the virus. Alternatively, the antibodies may be
used to identify viral particles isolated by other techniques. The viral antigens and the genomic material
within the isolated viral particles may then be further characterized.

In addition to the above, the information provided infra allows the identification of additional HCV strains
or isolates. The isolation and characterization of the additional HCV strains or isolates may be accomplished
by isolating the nucleic acids from body components which contain viral particles and/or viral RNA, creating
cDNA libraries using polynucleotide probes based on the HCV cDNA probes described infra, screening the
libraries for clones containing HCV cDNA sequences described infra, and comparing the HCV cDNAs from
the new isolates with the cDNAs described infra. The polypeptides encoded therein, or in the viral genome,
may be monitored for immunological cross-reactivity utilizing the polypeptides and antibodies described
supra. Strains or isolates which fit within the parameters of HCV, as described in the Definitions section,
supra, are readily identifiable. Other methods for identifying HCV strains will be obvious to those of skill in
the art, based upon the information provided herein.
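
The comparison of cDNAs from a new isolate with the cDNAs described here can be sketched as a simple identity calculation. The following is a minimal illustration, assuming the two sequences have already been aligned to equal length without gaps; the function name and the toy sequences are hypothetical, and no particular alignment or scoring method is implied by the text.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of positions at which two pre-aligned, equal-length
    sequences carry the same residue (no gap handling)."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(
        1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a == b
    )
    return 100.0 * matches / len(seq_a)


if __name__ == "__main__":
    # Placeholder sequences, not HCV data: one mismatch in ten positions.
    print(percent_identity("ACGTACGTAC", "ACGTTCGTAC"))
```

A value computed this way is only as meaningful as the alignment behind it, so it is a screening aid rather than a substitute for the hybridization and cross-reactivity tests described above.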



Isolation of the HCV cDNA Sequences

The novel HCV cDNA sequences described infra extend the sequence of the cDNA to the HCV
genome reported in EPO Pub. No. 318,216. The sequences which are present in clones b114a, 18g, ag30a,
CA205a, CA290a, CA216a, pi14a, CA167b, CA156e, CA84a, and CA59a lie upstream of the reported
sequence, and when compiled, yield nucleotides nos. -319 to 1348 of the composite HCV cDNA sequence.
(The negative number on a nucleotide indicates its distance upstream of the nucleotide which starts the
putative initiator MET codon.) The sequences which are present in clones b5a and 16jh lie downstream of
the reported sequence, and yield nucleotides nos. 8659 to 8866 of the composite sequence. The composite
HCV cDNA sequence, which includes the sequences in the aforementioned clones, is shown in Fig. 17.
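
The numbering convention just described can be made concrete with a small conversion routine. This is a minimal sketch, assuming that the first nucleotide of the putative initiator MET codon is numbered 1, that the nucleotide immediately upstream is numbered -1, and that there is no position 0; that last point is an assumption, not something the text states. The constant and function names are hypothetical.

```python
FIRST_NT = -319  # most upstream nucleotide of the composite sequence
LAST_NT = 8866   # most downstream nucleotide of the composite sequence

# Regions contributed by the newly described clones.
NOVEL_REGIONS = ((-319, 1348), (8659, 8866))


def to_index(nt_number: int) -> int:
    """Map a nucleotide number to a 0-based index into the composite
    sequence string. ASSUMPTION: numbering skips 0 (..., -2, -1, 1, 2, ...)."""
    if nt_number == 0 or not FIRST_NT <= nt_number <= LAST_NT:
        raise ValueError(f"nucleotide number out of range: {nt_number}")
    if nt_number < 0:
        return nt_number - FIRST_NT
    return nt_number - FIRST_NT - 1


def is_novel(nt_number: int) -> bool:
    """True if the nucleotide lies in one of the newly described regions."""
    return any(lo <= nt_number <= hi for lo, hi in NOVEL_REGIONS)


if __name__ == "__main__":
    for n in (-319, -1, 1, 1348, 8866):
        print(n, to_index(n), is_novel(n))
```

If Fig. 17 does in fact use a position 0, the positive branch of `to_index` would lose its `- 1` and the range check would change accordingly.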
The novel HCV cDNAs described herein were isolated from a number of HCV cDNA libraries, including
the "c" library present in lambda gt11 (ATCC No. 40394). The HCV cDNA libraries were constructed using
pooled serum from a chimpanzee with chronic HCV infection and containing a high titer of the virus, i.e., at
least 10^6 chimp infectious doses/ml (CID/ml). The pooled serum was used to isolate viral particles; nucleic
acids isolated from these particles were used as the template in the construction of cDNA libraries to the
viral genome. The procedures for isolation of putative HCV particles and for constructing the "c" HCV cDNA
library are described in EPO Pub. No. 318,216. Other methods for constructing HCV cDNA libraries are
known in the art, and some of these methods are described infra, in the Examples. Isolation of the
sequences was by screening the libraries using synthetic polynucleotide probes, the sequences of which
were derived from the 5'-region and the 3'-region of the known HCV cDNA sequence. The description of
the method to retrieve the cDNA sequences is mostly of historical interest. The resultant sequences (and
their complements) are provided herein, and the sequences, or any portion thereof, could be prepared
using synthetic methods, or by a combination of synthetic methods with retrieval of partial sequences using
methods similar to those described herein.



Preparation of Viral Polypeptides and Fragments

The availability of HCV cDNA sequences, or nucleotide sequences derived therefrom (including
segments and modifications of the sequence), permits the construction of expression vectors encoding
antigenically active regions of the polypeptide encoded in either strand. These antigenically active regions
may be derived from coat or envelope antigens or from core antigens, or from antigens which are non-structural
including, for example, polynucleotide binding proteins, polynucleotide polymerase(s), and other
viral proteins required for the replication and/or assembly of the virus particle. Fragments encoding the
desired polypeptides are derived from the cDNA clones using conventional restriction digestion or by
synthetic methods, and are ligated into vectors which may, for example, contain portions of fusion
sequences such as beta-galactosidase or superoxide dismutase (SOD), preferably SOD. Methods and
vectors which are useful for the production of polypeptides which contain fusion sequences of SOD are
described in European Patent Office Publication number 0196056, published October 1, 1986. Vectors for
the expression of fusion polypeptides of SOD and HCV polypeptides encoded in a number of HCV clones
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u a i tr» in th* Evamoles Any desired portion of the HCV cDNA containing an open reading 
frame, in either sense strand, can be obtained as a recombinant polypeptide, such as a mature or fusion
protein; alternatively, a polypeptide encoded in the cDNA can be provided by chemical synthesis.

The DNA encoding the desired polypeptide, whether in fused or mature form, and whether or not
containing [...], may be ligated into expression vectors suitable for any
convenient host. Both eukaryotic and prokaryotic host systems are presently used in forming recombinant
polypeptides, and a summary of some of the more common control systems and host cell lines is given
infra. The polypeptide is then isolated from lysed cells or from the culture medium and purified to the extent
needed for its intended use. Purification may be by techniques known in the art, for example, differential
extraction, salt fractionation, chromatography [...], centrifugation,
and the like. See, for example, Methods in Enzymology for a variety of methods for purifying proteins.
Such polypeptides can be used as diagnostics [...] neutralizing antibodies may be [...] used as diagnostics, or [...]
isolating and identifying HCV particles.

Preparation of Antigenic Polypeptide and Conjugation with Carrier 

[...] succinimidyl 4-(N-maleimidomethyl)cyclohexane-1-carboxylate (SMCC) obtained from Pierce [...]
reagents [...] cellulose, [...] beads and the like; [...] skilled in the art.

[...] polypeptides comprising truncated HCV amino acid sequences [...]. Polypeptides comprising
such truncated sequences can be used as [...]. These polypeptides also are
candidate subunit antigens in compositions for antiserum production or vaccines. While these truncated
sequences can be produced by various known [...], it is generally preferred to
make synthetic or recombinant polypeptides comprising [...]. Polypeptides comprising these
truncated HCV sequences can be made up entirely of HCV sequences (one or more epitopes, either
contiguous or noncontiguous), or HCV sequences and heterologous sequences in a fusion protein. Useful
heterologous sequences include sequences that provide for secretion from a recombinant host, enhance the
immunological reactivity of the HCV epitope(s), or facilitate the coupling of the polypeptide to an
immunoassay support or a vaccine carrier. See, e.g., EPO Pub. No. 116,201; U.S. Pat. No. 4,722,840; EPO
Pub. No. 259,149; U.S. Pat. No. 4,629,783, the disclosures of which are incorporated herein by reference.

The size of polypeptides comprising the truncated HCV sequences can vary widely, the minimum size
being a sequence of sufficient size to provide an HCV epitope, while the maximum size is not critical. For
convenience, the maximum size usually is not substantially greater than that required to provide the desired
HCV epitopes and function(s) of the heterologous sequence, if any. Typically, the truncated HCV amino acid
sequence will range from about 5 to about 100 amino acids in length. More typically, however, the HCV
sequence will be a maximum of about 50 amino acids in length, preferably a maximum of about 30 amino
acids. It is usually desirable to select HCV sequences of at least about 10, 12 or 15 amino acids, up to a
maximum of about 20 or 25 amino acids.
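
As an illustration only, candidate truncated sequences in a chosen size range could be enumerated by tiling a protein sequence with overlapping windows. The sketch below is not part of the disclosed methods; the function name `tile_peptides`, the default window length and step, and the example sequence are all hypothetical. Positions are reported 1-based, with the starting residue as position 1.

```python
def tile_peptides(protein, length=15, step=5):
    """Return (start, end, peptide) windows that together span the protein.

    Positions are 1-based and inclusive. The final window is placed flush
    with the C-terminus so that no residues are left uncovered.
    """
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    n = len(protein)
    if n == 0:
        return []
    if n <= length:
        return [(1, n, protein)]
    starts = list(range(0, n - length + 1, step))
    if starts[-1] != n - length:
        starts.append(n - length)  # extra window covering the C-terminal end
    return [(s + 1, s + length, protein[s:s + length]) for s in starts]


if __name__ == "__main__":
    # Hypothetical 40-residue sequence, used only to exercise the function.
    example = "ACDEFGHIKLMNPQRSTVWY" * 2
    for start, end, peptide in tile_peptides(example, length=15, step=5):
        print(f"AA{start}-AA{end}: {peptide}")
```

A step smaller than the window length gives overlapping peptides, so an epitope that straddles one window boundary is still contained whole in a neighboring window, provided the epitope is no longer than the overlap.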

Truncated HCV amino acid sequences comprising epitopes can be identified in a number of ways. For
example, the entire viral protein sequence can be screened by preparing a series of short peptides that
together span the entire protein sequence. An example of antigenic screening of the regions of the HCV
polyprotein is shown infra. In addition, by starting with, for example, 100mer polypeptides, it would be
routine to test each polypeptide for the presence of epitope(s) showing a desired reactivity, and then to test
progressively smaller and overlapping fragments from an identified 100mer to map the epitope of interest.
Screening such peptides in an immunoassay is within the skill of the art. It is also known to carry out a
computer analysis of a protein sequence to identify potential epitopes, and then prepare oligopeptides
comprising the identified regions for screening. Such a computer analysis of the HCV amino acid sequence
is shown in Fig. 20, where the hydrophilic/hydrophobic character is displayed above the antigen index. The
amino acids are numbered from the starting MET (position 1) as shown in Fig. 17. It is appreciated by those
of skill in the art that such computer analysis of antigenicity does not always identify an epitope that
actually exists, and can also incorrectly identify a region of the protein as containing an epitope. Examples
of HCV amino acid sequences that may be useful, which are expressed from expression vectors comprised
of clones 5-1-1, 81, CA74a, 35f, 279a, C38, C33b, CA290a, C8f, C12f, 14c, 15e, C25c, C33c, C33f, 33g,
C39c, C40b, CA167b, are described infra. Other examples of HCV amino acid sequences that may be useful
as described herein are set forth below. It is to be understood that these peptides do not necessarily
precisely map one epitope, and may also contain HCV sequence that is not immunogenic. These
non-immunogenic portions of the sequence can be defined as described above using conventional techniques
and deleted from the described sequences. Further, additional truncated HCV amino acid sequences that
comprise an epitope or are immunogenic can be identified as described above. The following sequences
are given by amino acid number (i.e., "AAn") where n is the amino acid number as shown in Fig. 17:
AA1-AA25; AA1-AA50; AA1-AA84; AA9-AA177; AA1-AA10; AA5-AA20; AA20-AA25; AA35-AA45; AA50-AA100;
AA40-AA90; AA45-AA65; AA65-AA75; AA80-AA90; AA99-AA120; AA95-AA110; AA105-AA120; AA100-AA150;
AA150-AA200; AA155-AA170; AA190-AA210; AA200-AA250; AA220-AA240; AA245-AA265; AA250-AA300;
AA290-AA330; AA290-AA305; AA300-AA350; AA310-AA330; AA350-AA400; AA380-AA395; AA405-AA495;
AA400-AA450; AA405-AA415; AA415-AA425; AA425-AA435; AA437-AA582; AA450-AA500; AA440-AA460;
AA460-AA470; AA475-AA495; AA500-AA550; AA511-AA690; AA515-AA550; AA550-AA600; AA550-AA625;
AA575-AA605; AA585-AA600; AA600-AA650; AA600-AA625; AA635-AA665; AA650-AA700; AA645-AA680;
AA700-AA750; AA700-AA725; AA700-AA750; AA725-AA775; AA770-AA790; AA750-AA800; AA800-AA815;
AA825-AA850; AA850-AA875; AA800-AA850; AA920-AA990; AA850-AA900; AA920-AA945; AA940-AA965;
AA970-AA990; AA950-AA1000; AA1000-AA1060; AA1000-AA1025; AA1000-AA1050; AA1025-AA1040;
AA1040-AA1055; AA1075-AA1175; AA1050-AA1200; AA1070-AA1100; AA1100-AA1130; AA1140-AA1165;
AA1192-AA1457; AA1195-AA1250; AA1200-AA1225; AA1225-AA1250; AA1250-AA1300; AA1260-AA1310;
AA1260-AA1280; AA1266-AA1428; AA1300-AA1350; AA1290-AA1310; AA1310-AA1340; AA1345-AA1405;
AA1345-AA1365; AA1350-AA1400; AA1365-AA1380; AA1380-AA1405; AA1400-AA1450; AA1450-AA1500;
AA1460-AA1475; AA1475-AA1515; AA1475-AA1500; AA1500-AA1550; AA1500-AA1515; AA1515-AA1550;
AA1550-AA1600; AA1545-AA1560; AA1569-AA1931; AA1570-AA1590; AA1595-AA1610; AA1590-AA1650;
AA1610-AA1645; AA1650-AA1690; AA1685-AA1770; AA1689-AA1805; AA1690-AA1720; AA1694-AA1735;
AA1720-AA1745; AA1745-AA1770; AA1750-AA1800; AA1775-AA1810; AA1795-AA1850; AA1850-AA1900;
AA1900-AA1950; AA1900-AA1920; AA1916-AA2021; AA1920-AA1940; AA1949-AA2124; AA1950-AA2000;
AA1950-AA1985; AA1980-AA2000; AA2000-AA2050; AA2005-AA2025; AA2020-AA2045; AA2045-AA2100;
AA2045-AA2070; AA2054-AA2223; AA2070-AA2100; AA2100-AA2150; AA2150-AA2200; AA2200-AA2250;
AA2200-AA2325; AA2250-AA2330; AA2255-AA2270; AA2265-AA2280; AA2280-AA2290; AA2287-AA2385;
AA2300-AA2350; AA2290-AA2310; AA2310-AA2330; AA2330-AA2350; AA2350-AA2400; AA2348-AA2464;
AA2345-AA2415; AA2345-AA2375; AA2370-AA2410; AA2371-AA2502; AA2400-AA2450; AA2400-AA2425;
AA2415-AA2450; AA2445-AA2500; AA2445-AA2475; AA2470-AA2490; AA2500-AA2550; AA2505-AA2540;
AA2535-AA2560; AA2550-AA2600; AA2560-AA2580; AA2600-AA2650; AA2605-AA2620; AA2620-AA2650;
AA2640-AA2660; AA2650-AA2700; AA2655-AA2670; AA2670-AA2700; AA2700-AA2750; AA2740-

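[...]-AA2930; AA2925-AA2950; AA2945-end (C-terminal). [...] or incorporated into a [...]

[...] allows a prediction of the putative domains of the HCV "non-structural" (NS) proteins. The
locations of the individual NS proteins in the putative Flavivirus precursor polyprotein are fairly
well known. Moreover, these also coincide with observed gross fluctuations in the hydrophobicity profile
of the polyprotein. It is established that NS5 of Flaviviruses encodes the virion polymerase, and that NS1
corresponds with a complement fixation antigen which has been shown to be an effective vaccine in
animals. Recently, it has been shown that a flaviviral protease function resides in NS3. Due to the observed
similarities between HCV and the Flaviviruses, [...] deductions concerning the approximate locations of the
corresponding protein domains and functions in the HCV polyprotein are possible. The expression of
polypeptides containing these domains in a variety of recombinant host cells, including, for example,
bacteria, yeast, insect and vertebrate cells, should give rise to important immunological reagents which can
be used for [...].

Although the non-structural protein regions of the putative polyproteins of the HCV isolate described
herein and of Flaviviruses appear to have some similarity, there is less similarity between the putative
structural regions which are towards the N-terminus. In this region, there is a greater divergence in
sequence, and in addition, the hydrophobic profiles of the two regions show less similarity. This
"divergence" begins in the N-terminal region of the putative NS1 domain in HCV, and extends to the
presumed N-terminus. Nevertheless, it may still be possible to predict the approximate locations of the
putative nucleocapsid (N-terminal basic domain) and E (generally hydrophobic) domains within the HCV
polyprotein. In the Examples the predictions are based on the changes observed in the hydrophobic profile
of the HCV polyprotein, and on a knowledge of the location and character of the flaviviral proteins. From
these predictions it may be possible to identify approximate regions of the HCV polyprotein that could
correspond with useful immunological reagents. For example, the E and NS1 proteins of Flaviviruses are
known to have efficacy as protective vaccines. These regions, as well as some which are shown to be
antigenic in the HCV isolate described herein, for example those within putative NS3, C, and NS5, etc.,
should also provide diagnostic reagents. Moreover, the location and expression of viral-encoded enzymes
may also allow the evaluation of anti-viral enzyme inhibitors, i.e., for example, inhibitors which prevent
enzyme activity by virtue of an interaction with the enzyme itself, or substances which may prevent
expression of the enzyme (for example, anti-sense RNA, or other drugs which interfere with expression).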
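
The kind of hydrophobic profile referred to in this section can be illustrated with a sliding-window average over a per-residue hydropathy scale. The sketch below is an assumption-laden illustration, not the analysis used to produce the figures: it uses the Kyte-Doolittle scale, which this document does not name, and the function name `hydropathy_profile`, the window size, and the example sequence are hypothetical. It does not compute an antigen index.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
# Assumption: the document does not state which scale was used.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=9):
    """Return the mean hydropathy of each window along the sequence.

    Element i of the result covers residues i+1 .. i+window (1-based).
    """
    sequence = sequence.upper()
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[residue] for residue in sequence]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


if __name__ == "__main__":
    # Hypothetical sequence: a charged stretch followed by a hydrophobic one.
    example = "MKRRDEKKRSGLLIVVAALLFIAVG"
    for position, value in enumerate(hydropathy_profile(example), start=1):
        print(position, round(value, 2))
```

Gross shifts in such a profile, from strongly negative (hydrophilic or basic) stretches to strongly positive (hydrophobic) ones, are the sort of feature on which domain boundaries might tentatively be placed; any such assignment remains a prediction.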

Preparation of Hybrid Particle Immunogens Containing HCV Epitopes

The immunogenicity of the epitopes of HCV may also be enhanced by preparing them in mammalian or
yeast systems fused with or assembled with particle-forming proteins such as, for example, that associated
with hepatitis B surface antigen. [...] directly to the particle-forming protein coding sequences produce
hybrids which are immunogenic with respect to the HCV epitope. [...]

[...] (1984)). The formation of such particles has been shown to enhance the immunogenicity of the monomer
subunit. The constructs may also include the immunodominant epitope of HBSAg, comprising the 55 amino
acids of the presurface (pre-S) region. [...] Constructs of the pre-S-HBSAg particle
expressible in yeast are disclosed in [...] 1986; hybrids including heterologous
viral sequences for yeast expression are disclosed in EPO 175,261, published March 26, 1986. These
constructs may also be expressed in mammalian cells such as Chinese hamster ovary (CHO) cells using an
[...]

[...] HBV antigenic sites from competition with the HCV epitope.



Preparation of Vaccines 

Vaccines may be prepared from one or more immunogenic polypeptides derived from HCV cDNA,
including the cDNA sequences described in the Examples. The observed homology between HCV and
Flaviviruses provides information concerning the polypeptides which may be most effective as vaccines, as
well as the regions of the genome in which they are encoded. The general structure of the Flavivirus
genome is discussed in Rice et al. (1986). The flavivirus genomic RNA is believed to be the only
virus-specific mRNA species, and it is translated into the three viral structural proteins, i.e., C, M, and E, as well
as two large nonstructural proteins, NS4 and NS5, and a complex set of smaller nonstructural proteins. It is
known that major neutralizing epitopes for Flaviviruses reside in the E (envelope) protein (Roehrig (1986)).
Thus, vaccines may be comprised of recombinant polypeptides containing epitopes of HCV E. These
polypeptides may be expressed in bacteria, yeast, or mammalian cells, or alternatively may be isolated
from viral preparations. It is also anticipated that the other structural proteins may also contain epitopes
which give rise to protective anti-HCV antibodies. Thus, polypeptides containing the epitopes of E, C, and M
may also be used, whether singly or in combination, in HCV vaccines.

In addition to the above, it has been shown that immunization with NS1 (nonstructural protein 1) results
in protection against yellow fever (Schlesinger et al. (1986)). This is true even though the immunization does
not give rise to neutralizing antibodies. Thus, particularly since this protein appears to be highly conserved
among Flaviviruses, it is likely that HCV NS1 will also be protective against HCV infection. Moreover, it also
shows that nonstructural proteins may provide protection against viral pathogenicity, even if they do not
cause the production of neutralizing antibodies.

The information provided in the Examples concerning the immunogenicity of the polypeptides
expressed from cloned HCV cDNAs which span the various regions of the HCV ORF also allows predictions
concerning their use in vaccines.

In view of the above, multivalent vaccines against HCV may be comprised of one or more epitopes
from one or more structural proteins, and/or one or more epitopes from one or more nonstructural proteins.
These vaccines may be comprised of, for example, recombinant HCV polypeptides and/or polypeptides
isolated from the virions. In particular, vaccines are contemplated comprising one or more of the following
HCV proteins, or subunit antigens derived therefrom: E, NS1, C, NS2, NS3, NS4 and NS5. Particularly
preferred are vaccines comprising E and/or NS1, or subunits thereof.

The preparation of vaccines which contain an immunogenic polypeptide(s) as active ingredients is
known to one skilled in the art. Typically, such vaccines are prepared as injectables, either as liquid
solutions or suspensions; solid forms suitable for solution in, or suspension in, liquid prior to injection may
also be prepared. The preparation may also be emulsified, or the protein encapsulated in liposomes. The
active immunogenic ingredients are often mixed with excipients which are pharmaceutically acceptable and
compatible with the active ingredient. Suitable excipients are, for example, water, saline, dextrose, glycerol,
ethanol, or the like and combinations thereof. In addition, if desired, the vaccine may contain minor amounts
of auxiliary substances such as wetting or emulsifying agents, pH buffering agents, and/or adjuvants which
enhance the effectiveness of the vaccine. Examples of adjuvants which may be effective include but are not
limited to: aluminum hydroxide, N-acetyl-muramyl-L-threonyl-D-isoglutamine (thr-MDP),
N-acetyl-nor-muramyl-L-alanyl-D-isoglutamine (CGP 11637, referred to as nor-MDP),
N-acetylmuramyl-L-alanyl-D-isoglutaminyl-L-alanine-2-(1'-2'-dipalmitoyl-sn-glycero-3-hydroxyphosphoryloxy)-ethylamine (CGP 19835A,
referred to as MTP-PE), and RIBI, which contains three components extracted from bacteria,
monophosphoryl lipid A, trehalose dimycolate and cell wall skeleton (MPL + TDM + CWS) in a 2%
squalene/Tween 60 emulsion. The effectiveness of an adjuvant may be determined by measuring the
amount of antibodies directed against an immunogenic polypeptide containing an HCV antigenic sequence
resulting from administration of this polypeptide in vaccines which are also comprised of the various
adjuvants.

The vaccines are conventionally administered parenterally, by injection, for example, either
subcutaneously or intramuscularly. Additional formulations which are suitable for other modes of administration
include suppositories and, in some cases, oral formulations. For suppositories, traditional binders and
carriers may include, for example, polyalkylene glycols or triglycerides; such suppositories may be formed
from mixtures containing the active ingredient in the range of 0.5% to 10%, preferably 1%-2%. Oral
formulations include such normally employed excipients as, for example, pharmaceutical grades of
mannitol, lactose, starch, magnesium stearate, sodium saccharine, cellulose, magnesium carbonate, and the
like. These compositions take the form of solutions, suspensions, tablets, pills, capsules, sustained release
formulations or powders and contain 10%-95% of active ingredient, preferably 25%-70%.

The proteins may be formulated into the vaccine as neutral or salt forms. Pharmaceutically acceptable
salts include the acid addition salts (formed with free amino groups of the peptide) which are formed
with inorganic acids such as, for example, hydrochloric or phosphoric acids, or such organic acids as
acetic, oxalic, tartaric, maleic, and the like. Salts formed with the free carboxyl groups may also be derived
from inorganic bases such as, for example, sodium, potassium, ammonium, calcium, or ferric hydroxides,
and such organic bases as isopropylamine, trimethylamine, 2-ethylamino ethanol, histidine, procaine, and
the like.



Dosage and Administration of Vaccines 

The vaccines are administered in a manner compatible with the dosage formulation, and in such
amount as will be prophylactically and/or therapeutically effective. The quantity to be administered, which is
generally in the range of 5 micrograms to 250 micrograms of antigen per dose, depends on the subject to
be treated, capacity of the subject's immune system to synthesize antibodies, and the degree of protection
desired. Precise amounts of active ingredient required to be administered may depend on the judgment of
the practitioner and may be peculiar to each subject.

The vaccine may be given in a single dose schedule, or preferably in a multiple dose schedule. A
multiple dose schedule is one in which a primary course of vaccination may be with 1-10 separate doses,
followed by other doses given at subsequent time intervals required to maintain and/or reinforce the
immune response, for example, at 1-4 months for a second dose, and if needed, a subsequent dose(s) after
several months. The dosage regimen will also, at least in part, be determined by the need of the individual
and be dependent upon the judgment of the practitioner.

In addition, the vaccine containing the immunogenic HCV antigen(s) may be administered in conjunction
with other immunoregulatory agents, for example, immune globulins.



Preparation of Antibodies Against HCV Epitopes

The immunogenic polypeptides prepared as described above are used to produce antibodies, both
polyclonal and monoclonal. If polyclonal antibodies are desired, a selected mammal (e.g., mouse, rabbit,
goat, horse, etc.) is immunized with an immunogenic polypeptide bearing an HCV epitope(s). Serum from
the immunized animal is collected and treated according to known procedures. If serum containing
polyclonal antibodies to an HCV epitope contains antibodies to other antigens, the polyclonal antibodies can
be purified by immunoaffinity chromatography. Techniques for producing and processing polyclonal
antisera are known in the art; see, for example, Mayer and Walker (1987).

Alternatively, polyclonal antibodies may be isolated from a mammal which has been previously infected
with HCV. An example of a method for purifying antibodies to HCV epitopes from serum from an infected
individual, based upon affinity chromatography and utilizing a fusion polypeptide of SOD and a polypeptide
encoded within cDNA clone 5-1-1, is presented in EPO Pub. No. 318,216.

Monoclonal antibodies directed against HCV epitopes can also be readily produced by one skilled in
the art. The general methodology for making monoclonal antibodies by hybridomas is well known. Immortal
antibody-producing cell lines can be created by cell fusion, and also by other techniques such as direct
transformation of B lymphocytes with oncogenic DNA, or transfection with Epstein-Barr virus. See, e.g., M.
Schreier et al. (1980); Hammerling et al. (1981); Kennett et al. (1980); see also U.S. Patent Nos. 4,341,761;
4,399,121; 4,427,783; 4,444,887; 4,466,917; 4,472,500; 4,491,632; and 4,493,890. Panels of monoclonal
antibodies produced against HCV epitopes can be screened for various properties, i.e., for isotype, epitope
affinity, etc.

Antibodies, both monoclonal and polyclonal, which are directed against HCV epitopes are particularly 
useful in diagnosis, and those which are neutralizing are useful in passive immunotherapy. Monoclonal 
antibodies, in particular, may be used to raise antiidiotype antibodies. 

Antiidiotype antibodies are immunoglobulins which carry an "internal image" of the antigen of the
infectious agent against which protection is desired. See, for example, Nisonoff, A., et al. (1981) and
Dreesman et al. (1985).

Techniques for raising antiidiotype antibodies are known in the art. See, for example, Grzych (1985),
MacNamara et al. (1984), and Uytdehaag et al. (1985). These antiidiotype antibodies may also be useful for
treatment and/or diagnosis of NANBH, as well as for an elucidation of the immunogenic regions of HCV
antigens.

It would also be recognized by one of ordinary skill in the art that a variety of types of antibodies
directed against HCV epitopes may be produced. As used herein, the term "antibody" refers to a
polypeptide or group of polypeptides which are comprised of at least one antibody combining site. An
"antibody combining site" or "binding domain" is formed from the folding of variable domains of an
antibody molecule(s) to form three-dimensional binding spaces with an internal surface shape and charge
distribution complementary to the features of an epitope of an antigen, which allows an immunological
reaction with the antigen. An antibody combining site may be formed from a heavy and/or a light chain
domain (VH and VL, respectively), which form hypervariable loops which contribute to antigen binding. The
term "antibody" includes, for example, vertebrate antibodies, hybrid antibodies, chimeric antibodies, altered
antibodies, univalent antibodies, the Fab proteins, and single domain antibodies.

A "single domain antibody" (dAb) is an antibody which is comprised of a VH domain, which reacts
immunologically with a designated antigen. A dAb does not contain a VL domain, but may contain other
antigen binding domains known to exist in antibodies, for example, the kappa and lambda domains.
Methods for preparing dAbs are known in the art. See, for example, Ward et al. (1989).

Antibodies may also be comprised of VH and VL domains, as well as other known antigen binding
domains. Examples of these types of antibodies and methods for their preparation are known in the art (see,
e.g., U.S. Patent No. 4,816,467, which is incorporated herein by reference), and include the following. For
example, "vertebrate antibodies" refers to antibodies which are tetramers or aggregates thereof, comprising
light and heavy chains which are usually aggregated in a "Y" configuration and which may or may not have
covalent linkages between the chains. In vertebrate antibodies, the amino acid sequences of all the chains
of a particular antibody are homologous with the chains found in one antibody produced by the lymphocyte
which produces that antibody in situ, or in vitro (for example, in hybridomas). Vertebrate antibodies
typically include native antibodies, for example, purified polyclonal antibodies and monoclonal antibodies.
Examples of the methods for the preparation of these antibodies are described infra.

"Hybrid antibodies" are antibodies wherein one pair of heavy and light chains is homologous to those in
a first antibody, while the other pair of heavy and light chains is homologous to those in a different second
antibody. Typically, each of these two pairs will bind different epitopes, particularly on different antigens.
This results in the property of "divalence", i.e., the ability to bind two antigens simultaneously. Such hybrids
may also be formed using chimeric chains, as set forth below.

"Chimeric antibodies" are antibodies in which the heavy and/or light chains are fusion proteins.
Typically the constant domain of the chains is from one particular species and/or class, and the variable
domains are from a different species and/or class. Also included is any antibody in which either or both of
the heavy or light chains are composed of combinations of sequences mimicking the sequences in
antibodies of different sources, whether these sources be differing classes, or different species of origin,
and whether or not the fusion point is at the variable/constant boundary. Thus, it is possible to produce
antibodies in which neither the constant nor the variable region mimics known antibody sequences. It then
becomes possible, for example, to construct antibodies whose variable region has a higher specific affinity
for a particular antigen, or whose constant region can elicit enhanced complement fixation, or to make other
improvements in properties possessed by a particular constant region.

Another example is "altered antibodies", which refers to antibodies in which the naturally occurring
amino acid sequence in a vertebrate antibody has been varied. Utilizing recombinant DNA techniques,
antibodies can be redesigned to obtain desired characteristics. The possible variations are many, and range
from the changing of one or more amino acids to the complete redesign of a region, for example, the
constant region. Changes in the constant region are made, in general, to attain desired cellular process
characteristics, e.g., changes in complement fixation, interaction with membranes, and other effector functions.
Changes in the variable region may be made to alter antigen binding characteristics. The antibody may also
be engineered to aid the specific delivery of a molecule or substance to a specific cell or tissue site. The
desired alterations may be made by known techniques in molecular biology, e.g., recombinant techniques,
site directed mutagenesis, etc.

Yet another example is "univalent antibodies", which are aggregates comprised of a heavy chain/light
chain dimer bound to the Fc (i.e., constant) region of a second heavy chain. This type of antibody escapes
antigenic modulation. See, e.g., Glennie et al. (1982).

Included also within the definition of antibodies are "Fab" fragments of antibodies. The "Fab" region
refers to those portions of the heavy and light chains which are roughly equivalent, or analogous, to the
sequences which comprise the branch portion of the heavy and light chains, and which have been shown to
exhibit immunological binding to a specified antigen, but which lack the effector Fc portion. "Fab" includes
aggregates of one heavy and one light chain (commonly known as Fab'), as well as tetramers containing
the 2H and 2L chains (referred to as F(ab)2), which are capable of selectively reacting with a designated
antigen or antigen family. "Fab" antibodies may be divided into subsets analogous to those described
above, i.e., "vertebrate Fab", "hybrid Fab", "chimeric Fab", and "altered Fab". Methods of producing "Fab"
fragments of antibodies are known within the art and include, for example, proteolysis, and synthesis by
recombinant techniques.

II.H. Diagnostic Oligonucleotide Probes and Kits

Using the disclosed portions of the isolated HCV cDNAs as a basis, oligomers of approximately 8
nucleotides or more can be prepared, either by excision or synthetically, which hybridize with the HCV
genome and are useful in identification of the viral agent(s), further characterization of the viral genome(s),
as well as in detection of the virus(es) in diseased individuals. The probes for HCV polynucleotides (natural
or derived) are a length which allows the detection of unique viral sequences by hybridization. While 6-8
nucleotides may be a workable length, sequences of 10-12 nucleotides are preferred, and about 20
nucleotides appears optimal. Preferably, these sequences will derive from regions which lack heterogeneity.
These probes can be prepared using routine methods, including automated oligonucleotide synthetic
methods. Among useful probes, for example, are those derived from the newly isolated clones disclosed
herein, as well as the various oligomers useful in probing cDNA libraries, set forth below. A complement to
any unique portion of the HCV genome will be satisfactory. For use as probes, complete complementarity is
desirable, though it may be unnecessary as the length of the fragment is increased.
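
As an illustration of what "a complement to any unique portion" means in practice, the sketch below derives fully complementary DNA oligomers of a chosen length from a target sequence written 5' to 3'. It is a minimal sketch under stated assumptions: the function names, the 20-nucleotide default, and the example sequence are hypothetical and are not taken from the disclosed HCV sequences, and no check for uniqueness or heterogeneity is performed.

```python
# Translation table for Watson-Crick complements of DNA bases.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return sequence.translate(_COMPLEMENT)[::-1]


def candidate_probes(target, length=20):
    """Return every fully complementary probe of the given length.

    Each probe is the reverse complement of one window of the target,
    so it is itself written 5'->3'.
    """
    if length < 1:
        raise ValueError("length must be positive")
    return [
        reverse_complement(target[i:i + length])
        for i in range(len(target) - length + 1)
    ]


if __name__ == "__main__":
    # Hypothetical target sequence, for illustration only.
    target = "ATGAGCACGAATCCTAAACCTCAAAGAAAAACC"
    for probe in candidate_probes(target, length=20)[:3]:
        print(probe)
```

In practice each candidate would still need to be screened against the rest of the genome, and against known sequence variation, before being chosen as a probe.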

For use of such probes as diagnostics, the biological sample to be analyzed, such as blood or serum,
may be treated, if desired, to extract the nucleic acids contained therein. The resulting nucleic acid from the
sample may be subjected to gel electrophoresis or other size separation techniques; alternatively, the
nucleic acid sample may be dot blotted without size separation. The probes are then labeled. Suitable
labels, and methods for labeling probes, are known in the art, and include, for example, radioactive labels
incorporated by nick translation or kinasing, biotin, fluorescent probes, and chemiluminescent probes. The
nucleic acids extracted from the sample are then treated with the labeled probe under hybridization
conditions of suitable stringencies, and polynucleotide duplexes containing the probe are detected.

The probes can be made completely complementary to the HCV genome. Therefore, usually high
stringency conditions are desirable in order to prevent false positives. However, conditions of high
stringency should only be used if the probes are complementary to regions of the viral genome which lack
heterogeneity. The stringency of hybridization is determined by a number of factors during hybridization
and during the washing procedure, including temperature, ionic strength, length of time, and concentration
of formamide. These factors are outlined in, for example, Maniatis, T. (1982).
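
How these factors trade off can be illustrated with a widely quoted empirical approximation for the melting temperature of a DNA duplex. This formula is not taken from this document; its coefficients differ slightly between references (the formamide and length terms in particular), and it is a poor guide for very short oligomers. The function name and all default values below are assumptions made for illustration.

```python
import math


def estimate_tm(length, gc_percent, na_molar,
                formamide_percent=0.0,
                formamide_coeff=0.61, length_term=600.0):
    """Rough melting temperature (deg C) of a DNA duplex.

    Empirical approximation:
        Tm = 81.5 + 16.6*log10([Na+]) + 0.41*(%GC)
             - formamide_coeff*(%formamide) - length_term/length
    The two coefficients exposed as parameters vary between references.
    """
    if length <= 0 or na_molar <= 0:
        raise ValueError("length and na_molar must be positive")
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent
            - formamide_coeff * formamide_percent
            - length_term / length)


if __name__ == "__main__":
    # Hypothetical conditions: the same 100-nt, 50% GC duplex with and
    # without 50% formamide at 1 M monovalent cation.
    print(estimate_tm(100, 50, 1.0))
    print(estimate_tm(100, 50, 1.0, formamide_percent=50))
```

Under this approximation, raising the formamide concentration or lowering the salt concentration lowers the melting temperature, so a fixed hybridization or wash temperature becomes more stringent.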

Generally, it is expected that the HCV genome sequences will be present in serum of infected
individuals at relatively low levels, i.e., at approximately 10^2-10^3 chimp infectious doses (CID) per ml. This
level may require that amplification techniques be used in hybridization assays. Such techniques are known
in the art. For example, the Enzo Biochemical Corporation "Bio-Bridge" system uses terminal
deoxynucleotide transferase to add unmodified 3'-poly-dT-tails to a DNA probe. The poly dT-tailed probe is
hybridized to the target nucleotide sequence, and then to a biotin-modified poly-A. PCT application
84/03520 and EPA 124221 describe a DNA hybridization assay in which: (1) analyte is annealed to a
single-stranded DNA probe that is complementary to an enzyme-labeled oligonucleotide; and (2) the resulting
tailed duplex is hybridized to an enzyme-labeled oligonucleotide. EPA 204510 describes a DNA hybridization
assay in which analyte DNA is contacted with a probe that has a tail, such as a poly-dT tail, an
amplifier strand that has a sequence that hybridizes to the tail of the probe, such as a poly-A sequence, and
which is capable of binding a plurality of labeled strands. A particularly desirable technique may first involve
amplification of the target HCV sequences in sera approximately 10,000 fold, i.e., to approximately 10^6
sequences/ml. This may be accomplished, for example, by the polymerase chain reaction (PCR) technique
described by Saiki et al. (1986), by Mullis, U.S. Patent No. 4,683,195, and by Mullis et al., U.S.
Patent No. 4,683,202. The amplified sequence(s) may then be detected using a hybridization assay which is
described in EP 317,077, published May 24, 1989. These hybridization assays, which should detect
sequences at the level of 10^6/ml, utilize nucleic acid multimers which bind to single-stranded analyte
nucleic acid, and which also bind to a multiplicity of single-stranded labeled oligonucleotides. A suitable
solution phase sandwich assay which may be used with labeled polynucleotide probes, and the methods for
the preparation of probes, are described in EPO 225,807, published June 16, 1987.
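
By way of illustration only, the arithmetic behind the amplification step may be sketched as follows. The sketch assumes a starting level of 10^2 to 10^3 target sequences per ml (treating infectious doses as sequence copies is a simplifying assumption, chosen because it agrees with the 10,000-fold and 10^6 figures given above) and an idealized PCR in which the target doubles in every cycle. The function names are hypothetical.

```python
import math


def amplified_per_ml(start_per_ml: float, fold: float) -> float:
    """Target concentration after amplification by the given fold factor."""
    return start_per_ml * fold


def ideal_cycles_for_fold(fold: float) -> int:
    """Smallest number of cycles giving at least `fold` amplification,
    assuming perfect doubling in each cycle (real reactions are less efficient)."""
    return math.ceil(math.log2(fold))


fold = 1e4  # the approximately 10,000-fold amplification mentioned in the text
for start in (1e2, 1e3):  # assumed starting range, sequences per ml
    print(f"{start:.0e}/ml -> {amplified_per_ml(start, fold):.0e}/ml")
print("idealized cycles needed:", ideal_cycles_for_fold(fold))
```

Under these assumptions the lower starting level reaches the 10^6 per ml detection level quoted for the multimer hybridization assay; an actual protocol would use more cycles than the idealized minimum because doubling is never perfect.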

The probes can be packaged into diagnostic kits. Diagnostic kits include the probe DNA, which may be
labeled; alternatively, the probe DNA may be unlabeled and the ingredients for labeling may be included in
the kit in separate containers. The kit may also contain other suitably packaged reagents and materials
needed for the particular hybridization protocol, for example, standards, as well as instructions for
conducting the test.

Immunoassay and Diagnostic Kits 

Both the polypeptides which react immunologically with serum containing HCV antibodies, for example,
those detected by the antigenic screening method described infra, in the Examples, as well as those derived
from or encoded within the isolated clones described in the Examples, and composites thereof, and the
antibodies raised against the HCV specific epitopes in these polypeptides, are useful in immunoassays to
detect presence of HCV antibodies, or the presence of the virus and/or viral antigens, in biological samples.
Design of the immunoassays is subject to a great deal of variation, and a variety of these are known in the
art. For example, the immunoassay may utilize one viral epitope; alternatively, the immunoassay may use a
combination of viral epitopes derived from these sources; these epitopes may be derived from the same or
from different viral polypeptides, and may be in separate recombinant or natural polypeptides, or together in
the same polypeptide. The immunoassay may use, for example, a combination of monoclonal antibodies
directed towards epitopes of one viral antigen, monoclonal antibodies directed towards epitopes of different
viral antigens, polyclonal antibodies directed towards the same viral antigen, or polyclonal antibodies directed
towards different viral antigens. Protocols may be based, for example, upon competition, or direct reaction,
or sandwich type assays. Protocols may also, for example, use solid supports, or may be by
immunoprecipitation. Most assays involve the use of labeled antibody or polypeptide; the labels may be, for
example, fluorescent, chemiluminescent, radioactive, or dye molecules. Assays which amplify the signals from
the probe are also known; examples of which are assays which utilize biotin and avidin, and enzyme-labeled
and mediated immunoassays, such as ELISA assays.

Some of the antigenic regions of the putative polyprotein have been mapped and identified by
screening the antigenicity of bacterial expression products of HCV cDNAs which encode portions of the
polyprotein. See the Examples. Other antigenic regions of HCV may be detected by expressing the portions
of the HCV cDNAs in other expression systems, including yeast systems and cellular systems derived from
insects and vertebrates. In addition, studies giving rise to an antigenicity index and
hydrophobicity/hydrophilicity profile give rise to information concerning the probability of a region's
antigenicity.

Studies on antigenic mapping by expression of HCV cDNAs showed that a number of clones
containing these cDNAs expressed polypeptides which were immunologically reactive with serum from
individuals with NANBH. No single polypeptide was immunologically reactive with all sera. Five of these
polypeptides were very immunogenic in that antibodies to the HCV epitopes in these polypeptides were
detected in many different patient sera, although the overlap in detection was not complete. Thus, the
results on the immunogenicity of the polypeptides encoded in the various clones suggest that efficient
detection systems may include the use of panels of epitopes. The epitopes in the panel may be
constructed into one or multiple polypeptides.
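
The reasoning for a panel of epitopes may be illustrated with a small sketch. The reactivity data below are entirely hypothetical (the polypeptide and serum names are placeholders, not results from the Examples); the sketch only shows how incomplete overlap between individual polypeptides can translate into higher coverage for a panel.

```python
# Hypothetical reactivity data: the set of patient sera in which antibodies
# to each polypeptide were detected. Names and values are illustrative only.
reactivity = {
    "polypeptide_A": {"serum1", "serum2", "serum3"},
    "polypeptide_B": {"serum2", "serum4"},
    "polypeptide_C": {"serum5"},
}
all_sera = {"serum1", "serum2", "serum3", "serum4", "serum5"}


def coverage(panel, reactivity, all_sera):
    """Fraction of sera detected by at least one polypeptide in the panel."""
    detected = set().union(*(reactivity[name] for name in panel))
    return len(detected & all_sera) / len(all_sera)


for name in reactivity:
    print(name, coverage([name], reactivity, all_sera))
print("full panel", coverage(list(reactivity), reactivity, all_sera))
```

In this made-up data set no single polypeptide detects every serum, whereas the three together do, which is the situation the preceding paragraph describes.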

Kits suitable for immunodiagnosis and containing the appropriate labeled reagents are constructed by
packaging the appropriate materials, including the polypeptides of the invention containing HCV epitopes or
antibodies directed against HCV epitopes in suitable containers, along with the remaining reagents and
materials required for the conduct of the assay, as well as a suitable set of assay instructions.



Further Characterization of the HCV Genome, Virions, and Viral Antigens Using Probes Derived From cDNA
to the Viral Genome

The HCV cDNA sequence information in the newly isolated clones described in the Examples may be
used to gain further information on the sequence of the HCV genome, and for identification and isolation of
the HCV agent, and thus will aid in its characterization including the nature of the genome, the structure of
the viral particle, and the nature of the antigens of which it is composed. This information, in turn, can lead
to additional polynucleotide probes, polypeptides derived from the HCV genome, and antibodies directed
against HCV epitopes which would be useful for the diagnosis and/or treatment of HCV caused NANBH.
The cDNA sequence information in the abovementioned clones is useful for the design of probes for the
isolation of additional cDNA sequences which are derived from as yet undefined regions of the HCV
genome(s) from which the cDNAs in clones described herein and in EP 0,318,216 are derived. For example,
labeled probes containing a sequence of approximately 8 or more nucleotides, and preferably 20 or more
nucleotides, which are derived from regions close to the 5'-termini or 3'-termini of the composite HCV
cDNA sequence shown in Fig. 17 may be used to isolate overlapping cDNA sequences from HCV cDNA
libraries. Alternatively, characterization of the genomic segments could be from the viral genome(s) isolated
from purified HCV particles. Methods for purifying HCV particles and for detecting them during the

purrhcati^ procedure are obs procedure which may be used is that described in EP 

The isolated genomic segments could then be cloned and sequenced. An example of this,
which utilizes amplification of the sequences to be cloned, is provided infra, and yielded clone
16jh.

Methods for constructing cDNA libraries are known in the art, and are discussed supra and infra.
cDNA libraries which are useful for screening with nucleic acid probes may also be constructed in
other vectors known in the art, for example, lambda-gt10 (Huynh et al. (1985)).

Screening for Anti-Viral Agents for HCV

4„ without causing cell death. are knQVvn jn tte ^ ^ 

The anti-v.ra. ^^.^^^ virion Smponents and/or cellular components which are 
include. <or e^pl ^those wh chjt^d wtth iv.no p ^ 

OT aganta (non-co.al.mly attaahad » ^ » cJSa, polynaoteoHdes »hlcn 

ennance antvu. « rna mav ^ Hesioned based upon the sequence information of the HCV 

they may include analogs, attached proteins, substituted or altered bonding between bases, etc.
Other types of drugs may be based upon polynucleotides which "mimic" important control regions of
the HCV genome, and which may be therapeutic due to their interactions with key components of the
system responsible for viral infectivity or replication.


General Methods 

The general techniques used in extracting the genome from a virus, preparing and probing a cDNA
library, sequencing clones, constructing expression vectors, transforming cells, performing immunological
assays such as radioimmunoassays and ELISA assays, for growing cells in culture, and the like are known
in the art and laboratory manuals are available describing these techniques. However, as a general guide,
the following sets forth some sources currently available for such procedures, and for materials useful in
carrying them out.

Both prokaryotic and eukaryotic host cells may be used for expression of desired coding sequences
when appropriate control sequences which are compatible with the designated host are used. Among
prokaryotic hosts, E. coli is most frequently used. Expression control sequences for prokaryotes include
promoters, optionally containing operator portions, and ribosome binding sites. Transfer vectors compatible
with prokaryotic hosts are commonly derived from, for example, pBR322, a plasmid containing operons
conferring ampicillin and tetracycline resistance, and the various pUC vectors, which also contain
sequences conferring antibiotic resistance markers. These markers may be used to obtain successful
transformants by selection. Commonly used prokaryotic control sequences include the Beta-lactamase
(penicillinase) and lactose promoter systems (Chang et al. (1977)), the tryptophan (trp) promoter system
(Goeddel et al. (1980)) and the lambda-derived P_L promoter and N gene ribosome binding site (Shimatake
et al. (1981)) and the hybrid tac promoter (De Boer et al. (1983)) derived from sequences of the trp and lac
UV5 promoters. The foregoing systems are particularly compatible with E. coli; if desired, other prokaryotic
hosts such as strains of Bacillus or Pseudomonas may be used, with corresponding control sequences.

Eukaryotic hosts include yeast and mammalian cells in culture systems. Saccharomyces cerevisiae and
Saccharomyces carlsbergensis are the most commonly used yeast hosts, and are convenient fungal hosts.
Yeast compatible vectors carry markers which permit selection of successful transformants by conferring
prototrophy to auxotrophic mutants or resistance to heavy metals on wild-type strains. Yeast compatible
vectors may employ the 2 micron origin of replication (Broach et al. (1983)), the combination of CEN3 and
ARS1 or other means for assuring replication, such as sequences which will result in incorporation of an
appropriate fragment into the host cell genome. Control sequences for yeast vectors are known in the art
and include promoters for the synthesis of glycolytic enzymes (Hess et al. (1968); Holland et al. (1978)),
including the promoter for 3-phosphoglycerate kinase (Hitzeman (1980)). Terminators may also be included,
such as those derived from the enolase gene (Holland (1981)). Particularly useful control systems are those
which comprise the glyceraldehyde-3 phosphate dehydrogenase (GAPDH) promoter or alcohol
dehydrogenase (ADH) regulatable promoter, terminators also derived from GAPDH, and if secretion is desired,
leader sequence from yeast alpha factor. In addition, the transcriptional regulatory region and the
transcriptional initiation region which are operably linked may be such that they are not naturally associated
in the wild-type organism. These systems are described in detail in EPO 120,551, published October 3, 1984;
EPO 116,201, published August 22, 1984; and EPO 164,556, published December 18, 1985, all of which are
assigned to the herein assignee, and are hereby incorporated herein by reference.

Mammalian cell lines available as hosts for expression are known in the art and include many
immortalized cell lines available from the American Type Culture Collection (ATCC), including HeLa cells,
Chinese hamster ovary (CHO) cells, baby hamster kidney (BHK) cells, and a number of other cell lines.
Suitable promoters for mammalian cells are also known in the art and include viral promoters such as that
from Simian Virus 40 (SV40) (Fiers (1978)), Rous sarcoma virus (RSV), adenovirus (ADV), and bovine
papilloma virus (BPV). Mammalian cells may also require terminator sequences and poly A addition
sequences; enhancer sequences which increase expression may also be included, and sequences which
cause amplification of the gene may also be desirable. These sequences are known in the art. Vectors
suitable for replication in mammalian cells may include viral replicons, or sequences which insure
integration of the appropriate sequences encoding NANBV epitopes into the host genome.

Transformation may be by any known method for introducing polynucleotides into a host cell, including,
for example, packaging the polynucleotide in a virus and transducing a host cell with the virus, and by direct
uptake of the polynucleotide. The transformation procedure used depends upon the host to be transformed.
For example, transformation of the E. coli host cells with lambda-gt11 containing BB-NANBV sequences is
discussed in the Example section, infra. Bacterial transformation by direct uptake generally employs
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(,978), or the various known ""^J^^TL are Known In the art. Slte-specific ONA cleavage is 

^Zie or aga^se ge' etectrophoresis techniques, according to the genera, procedures found ,n 
Methods in Enzymology (1980) 65:499-560. Doi vm erase I (Klenow) In the 

presence of the "P^JP^^^S in the hydrolysis of any single stranded DNA portions. 
SI nuclease may also be ^ Xdmp l^ conditions using T4 ONA .igase and 

Ligations are earned out usmg fan^™ ^ d „ atloos . vec tor fragments 

ATP: sticky end ligations requ.re less ATP and les ^"""""J 9 ^ wjth bactenal alka |ine 

DNA sequences, including those isolated from cDNA ' ^ ONA to be 

including, for example site directed Sd^SS a cSe stranded ONA 

modified is packaged into phage * SSiS »mp.ernentary to the portion of the 
a, with DNA polymerase using, as a primer, a synfoeto ' ^ sequence. The resulting 

sequence. The sequer^s whic* "J GrunsteIn Hogness (1975). Briefiy, in this 

ONA libraries may be probed using the P»cMura or «™ denatured, and prehybrldized with 

procedure, the ONA to be probed is .mmo^zed I on (wW) eacn of Lrine serum 

« a buffer containing 0-50% i 75 " J* J ^ ^osSate pH 65). 0.1% SOS, and 100 

P T2lJ?rS SI 5 UenC 5 mSK tfe buffer! as we,, as the time and 
micrograms/ml carrier denatured dna. 1 ne perce » hybridization steps depends on the stnn- 

temperature conditions of the ^Sh^LTto^^^^ are generally used with low 
gency required. Oligomenc probes wh.ch require lower stnngency a morethan 

« 'percentages of formamlde. tower temper Jures and longer J^^^SmJq^ employ higher 
30 or 40 -a**^ - ^ prenybrtdiz , 
temperatures, e.g.. about 40-42 C. and a 1 r»gn PJJJJ ' ' 9 fl|t incubate d in this mixture 

don. ^-labeled ^^^rTtnt Seated S subjected to autoradiography to show 

. TJZrr^SX SS?£SS. -ocations on the origin, agar plates Is used as 
the source of the desired ONA. . , nrfflftH Wo p co n strain HB101 or other 
deazoguanosine according to Barr et al. (1986).

The enzyme-linked immunosorbent assay (ELISA) can be used to measure either antigen or antibody
concentrations. This method depends upon conjugation of an enzyme to either an antigen or an antibody,
and uses the bound enzyme activity as a quantitative label. To measure antibody, the known antigen is
fixed to a solid phase (e.g., a microplate or plastic cup), incubated with test serum dilutions, washed,
incubated with anti-immunoglobulin labeled with an enzyme, and washed again. Enzymes suitable for
labeling are known in the art, and include, for example, horseradish peroxidase. Enzyme activity bound to
the solid phase is measured by adding the specific substrate, and determining product formation or
substrate utilization colorimetrically. The enzyme activity bound is a direct function of the amount of
antibody bound.

To measure antigen, a known specific antibody is fixed to the solid phase, the test material containing
antigen is added, after an incubation the solid phase is washed, and a second enzyme-labeled antibody is
added. After washing, substrate is added, and enzyme activity is estimated colorimetrically, and related to
antigen concentration.
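
One common way of relating a colorimetric reading to antigen concentration is to interpolate on a curve of standards run on the same plate. The following is a minimal sketch under that assumption; it is not a procedure stated in this specification, the standard values are invented for illustration, and piecewise-linear interpolation is only one of several curve-fitting choices.

```python
def interpolate_concentration(absorbance, standards):
    """Estimate concentration from absorbance by piecewise-linear interpolation.

    `standards` is a list of (concentration, absorbance) pairs measured for
    known antigen amounts; values outside their range raise ValueError.
    """
    points = sorted(standards, key=lambda pair: pair[1])
    for (c0, a0), (c1, a1) in zip(points, points[1:]):
        if a0 <= absorbance <= a1:
            if a1 == a0:
                return c0
            return c0 + (c1 - c0) * (absorbance - a0) / (a1 - a0)
    raise ValueError("absorbance outside the range of the standards")


# Hypothetical standards: (antigen concentration in ng/ml, absorbance)
standards = [(0.0, 0.0), (10.0, 0.5), (20.0, 1.0)]
print(interpolate_concentration(0.75, standards))
```

Readings outside the range of the standards are rejected rather than extrapolated, since the colorimetric response is generally not linear beyond the calibrated range.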


Examples 

Described below are examples of the present invention which are provided only for illustrative purposes,
and not to limit the scope of the present invention. In light of the present disclosure, numerous
embodiments within the scope of the claims will be apparent to those of ordinary skill in the art.

Isolation and Sequence of Overlapping HCV cDNA Clones 13i, 26j, CA59a, CA84a, CA156e and CA167b

The clones 13i, 26j, CA59a, CA84a, CA156e and CA167b were isolated from the lambda-gt11 library
which contains HCV cDNA (ATCC No. 40394), the preparation of which is described in EPO Pub. No.
318,216 (published 31 May 1989), and WO 89/04669 (published 1 June 1989). Screening of the library was
with the probes described infra, using the method described in Huynh (1985). The frequency with which
positive clones appeared with the respective probes was about 1 in 50,000.

The isolation of clone 13i was accomplished using a synthetic probe derived from the sequence of
clone 12f. The sequence of the probe was:
5' GAA CGT TGC GAT CTG GAA GAC AGG GAC AGG 3'.

The isolation of clone 26j was accomplished using a probe derived from the 5'-region of clone K9-1.
The sequence of the probe was:

5' TAT CAG TTA TGC CAA CGG AAG CGG CCC CGA 3'.

The isolation procedures for clone 12f and for clone k9-1 (also called K9-1) are described in EPO Pub.
No. 318,216, and their sequences are shown in Figs. 1 and 2, respectively. The HCV cDNA sequences of
clones 13i and 26j are shown in Figs. 4 and 5, respectively. Also shown are the amino acids encoded
therein, as well as the overlap of clone 13i with clone 12f, and the overlap of clone 26j with clone 13i. The
sequences for these clones confirmed the sequence of clone K9-1. Clone K9-1 had been isolated from a
different HCV cDNA library (See EP 0,318,216).

Clone CA59a was isolated utilizing a probe based upon the sequence of the 5'-region of clone 26j. The
sequence of this probe was:

5' CTG GTT AGC AGG GCT TTT CTA TCA CCA CAA 3'.

A probe derived from the sequence of clone CA59a was used to isolate clone CA84a. The sequence of
the probe used for this isolation was:
5' AAG GTC CTG GTA GTG CTG CTG CTA TTT GCC 3'.

Clone CA156e was isolated using a probe derived from the sequence of clone CA84a. The sequence of
the probe was:

5' ACT GGA CGA CGC AAG GTT GCA ATT GCT CTA 3'.

Clone CA167b was isolated using a probe derived from the sequence of clone CA156e. The sequence
of the probe was:

5' TTC GAC GTC ACA TCG ATC TGC TTG TCG GGA 3'.
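
The stepwise isolation described above, in which each new clone supplies the probe for the next, may be summarized in a short sketch. The probe sequences are transcribed as printed in this section; the data layout and the function names are illustrative choices only.

```python
# (clone the probe was derived from, clone isolated with it, probe 5'->3')
PROBES = [
    ("12f", "13i", "GAA CGT TGC GAT CTG GAA GAC AGG GAC AGG"),
    ("K9-1", "26j", "TAT CAG TTA TGC CAA CGG AAG CGG CCC CGA"),
    ("26j", "CA59a", "CTG GTT AGC AGG GCT TTT CTA TCA CCA CAA"),
    ("CA59a", "CA84a", "AAG GTC CTG GTA GTG CTG CTG CTA TTT GCC"),
    ("CA84a", "CA156e", "ACT GGA CGA CGC AAG GTT GCA ATT GCT CTA"),
    ("CA156e", "CA167b", "TTC GAC GTC ACA TCG ATC TGC TTG TCG GGA"),
]


def probe_length(spaced_sequence):
    """Length in nucleotides of a probe written in space-separated triplets."""
    return len(spaced_sequence.replace(" ", ""))


def gc_fraction(spaced_sequence):
    """Fraction of G and C residues in the probe."""
    seq = spaced_sequence.replace(" ", "")
    return sum(base in "GC" for base in seq) / len(seq)


def walk_from(start_clone):
    """Follow the source -> isolated links starting from one clone."""
    next_clone = {source: isolated for source, isolated, _ in PROBES}
    path = [start_clone]
    while path[-1] in next_clone:
        path.append(next_clone[path[-1]])
    return path


for source, isolated, seq in PROBES:
    print(f"{source} -> {isolated}: {probe_length(seq)} nt, GC {gc_fraction(seq):.2f}")
print(" -> ".join(walk_from("26j")))
```

Each probe listed here is a 30-mer, and following the links from clone 26j reproduces the order CA59a, CA84a, CA156e, CA167b given in the text.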

The nucleotide sequences of the HCV cDNAs in clones CA59a, CA84a, CA156e, and CA167b are
shown in Figs. 6, 7, 8, and 9, respectively. The amino acids encoded therein, as well as the overlap with the
sequences of relevant clones, are also shown in the Figs.



Creation of "pi" HCV cDNA Library


A library of HCV cDNA, the "pi" library, was constructed from the same batch of infectious chimpanzee
plasma used to construct the lambda-gt11 HCV cDNA library (ATCC No. 40394) described in EPO Pub. No.
318,216, and utilizing essentially the same techniques. However, construction of the pi library utilized a
primer-extension method, in which the primer for reverse transcriptase was based on the sequence of clone
CA59a. The sequence of the primer was:
5' GGT GAC GTG GGT TTC 3'.



Isolation and Sequence of Clone pi14a



Screening of the "pi" HCV cDNA library described supra, with the probe used to isolate clone CA167b
(See supra) yielded clone pi14a. The clone contains about 800 base pairs of cDNA which overlaps clones
CA167b, CA156e, CA84a and CA59a, which were isolated from the lambda-gt11 HCV cDNA library (ATCC
No. 40394). In addition, pi14a also contains about 250 base pairs of DNA which are upstream of the HCV
cDNA in clone CA167b.



Isolation and Sequence of Clones CA216a, CA290a and ag30a



Based on the sequence of clone CA167b a synthetic probe was made having the following sequence:
5' GGC TTT ACC ACG TCA CCA ATG ATT GCC CTA 3'
The above probe was used to screen the lambda-gt11 library (ATCC No. 40394), which yielded clone
CA216a, whose HCV sequences are shown in Fig. 10.

Another probe was made based on the sequence of clone CA216a having the following sequence:
5' TTT GGG TAA GGT CAT CGA TAC CCT TAC GTG 3'

Screening the lambda-gt11 library (ATCC No. 40394) with this probe yielded clone CA290a, the HCV
sequences therein being shown in Fig. 11.

In a parallel approach, a primer-extension cDNA library was made using nucleic acid extracted from the
same infectious plasma used in the original lambda-gt11 cDNA library described above. The primer used
was based on the sequence of clones CA216a and CA290a:
5' GAA GCC GCA CGT AAG 3'
The cDNA library was made using methods similar to those described previously for libraries used in the
isolation of clones pi14a and k9-1. The probe used to screen this library was based on the sequence of
clone CA290a:
5' CCG GCG TAG GTC GCG CAA TTT GGG TAA 3'
Clone ag30a was isolated from the new library with the above probe, and contained about 670 base pairs of
HCV sequence. See Fig. 12. Part of this sequence overlaps the HCV sequence of clones CA216a and
CA290a. About 300 base pairs of the ag30a sequence, however, is upstream of the sequence from clone
CA290a. The non-overlapping sequence shows a start codon O and stop codons that may indicate the start
of the HCV ORF. Also indicated in Fig. 12 are putative small encoded peptides (#) which may play a role in
regulating translation, as well as the putative first amino acid of the putative polypeptide (/), and downstream
amino acids encoded therein.

Isolation and Sequence of Clone CA205a 


Clone CA205a was isolated from the original lambda-gt11 library (ATCC No. 40394), using a synthetic
probe derived from the HCV sequence in clone CA290a (Fig. 11). The sequence of the probe was:
5' TCA GAT CGT TGG TGG AGT TTA CTT GTT GCC 3'.
The sequence of the HCV cDNA in CA205a, shown in Fig. 13, overlaps with the cDNA sequences in both
clones ag30a and CA290a. The overlap of the sequence with that of CA290a is shown by the dotted line
above the sequence (the figure also shows the putative amino acids encoded in this fragment).

As observed from the HCV cDNA sequences in clones CA205a and ag30a, the putative HCV
polyprotein appears to begin at the ATG start codon; the HCV sequences in both clones contain an
in-frame, contiguous double stop codon (TGATAG) forty two nucleotides upstream from this ATG. The HCV
ORF appears to begin after these stop codons, and to extend for at least 8907 nucleotides (See the
composite HCV cDNA shown in Fig. 17).
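
The observation about the double stop codon can be checked mechanically on any candidate sequence. The sketch below uses a made-up toy sequence, not the sequence of Fig. 17, and assumes that "forty two nucleotides upstream" is counted from the first base of TGATAG to the A of the ATG; because 42 is a multiple of three, both stop codons then fall in the reading frame of the ATG.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def upstream_inframe_stops(seq, atg_index):
    """Distances (in nt upstream of the ATG) of stop codons that lie in the
    same reading frame as the ATG starting at `atg_index`."""
    distances = []
    pos = atg_index - 3
    while pos >= 0:
        if seq[pos:pos + 3] in STOP_CODONS:
            distances.append(atg_index - pos)
        pos -= 3
    return distances


# Toy sequence (hypothetical): 2 nt, TGATAG, 36 nt of filler codons, ATG, ORF.
toy = "CC" + "TGATAG" + "GCC" * 12 + "ATG" + "GCCAAGCTT"
atg = toy.index("ATG")
print(upstream_inframe_stops(toy, atg))
```

For the toy sequence the function reports stop codons 39 and 42 nucleotides upstream, i.e. the contiguous TAG and TGA of a TGATAG placed as assumed above.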


Isolation and Sequence of Clone 18g



Based on the sequence of clone ag30a (See Fig. 12) and of an overlapping clone from the original
lambda-gt11 library (ATCC No. 40394), CA230a, a synthetic probe was made having the following
sequence:

5' CCA TAG TGG TCT GCG GAA CCG GTG AGT ACA 3'.

Screening of the original lambda-gt11 HCV cDNA library with the probe yielded clone 18g, the HCV cDNA
sequence of which is shown in Fig. 14. Also shown in the figure are the overlap with clone ag30a, and
putative polypeptides encoded within the HCV cDNA.

The cDNA in clone 18g (C18g or 18g) overlaps that in clones ag30a and CA205a, described supra. The
sequence of C18g also contains the double stop codon region observed in clone ag30a. The polynucleotide
region upstream of these stop codons presumably represents part of the 5'-region of the HCV genome,
which may contain short ORFs, and which can be confirmed by direct sequencing of the purified HCV
genome. These putative small encoded peptides may play a regulatory role in translation. The region of the
HCV genome upstream of that represented by C18g can be isolated for sequence analysis using essentially
the technique described in EPO Pub. No. 318,216 for isolating cDNA sequences upstream of the HCV
cDNA sequence in clone 12f. Essentially, small synthetic oligonucleotide primers of reverse transcriptase,
which are based upon the sequence of C18g, are synthesized and used to bind to the corresponding
sequence in HCV genomic RNA. The primer sequences are proximal to the known 5'-terminal of C18g, but
sufficiently downstream to allow the design of probe sequences upstream of the primer sequences. Known
standard methods of priming and cloning are used. The resulting cDNA libraries are screened with
sequences upstream of the priming sites (as deduced from the elucidated sequence of C18g). The HCV
genomic RNA is obtained from either plasma or liver samples from individuals with NANBH. Since HCV
appears to be a Flavi-like virus, the 5'-terminus of the genome may be modified with a "cap" structure. It is
known that Flavivirus genomes contain 5'-terminal "cap" structures. (Yellow Fever virus, Rice et al. (1988);
Dengue virus, Hahn et al. (1988); Japanese Encephalitis Virus (1987)).



Isolation and Sequence of Clones from the beta-HCV cDNA Library

Clones containing cDNA representative of the 3'-terminal region of the HCV genome were isolated from
a cDNA library constructed from the original infectious chimpanzee plasma pool which was used for the
creation of the HCV cDNA lambda-gt11 library (ATCC No. 40394), described in EPO Pub. No. 318,216. In
order to create the DNA library, RNA extracted from the plasma was "tailed" with poly rA using poly (rA)
polymerase, and cDNA was synthesized using oligo(dT)12-18 as a primer for reverse transcriptase. The
resulting RNA:cDNA hybrid was digested with RNAase H, and converted to double stranded HCV cDNA.
The resulting HCV cDNA was cloned into lambda-gt10, using essentially the technique described in Huynh
(1985), yielding the beta (or b) HCV cDNA library. The procedures used were as follows.

An aliquot (12 ml) of the plasma was treated with proteinase K, and extracted with an equal volume of
phenol saturated with 0.05M Tris-Cl, pH 7.5, 0.05% (v/v) beta-mercaptoethanol, 0.1% (w/v)
hydroxyquinolone, 1 mM EDTA. The resulting aqueous phase was re-extracted with the phenol mixture, followed
by 3 extractions with a 1:1 mixture containing phenol and chloroform:isoamyl alcohol (24:1), followed by 2
extractions with a mixture of chloroform and isoamyl alcohol (1:1). Subsequent to adjustment of the aqueous
phase to 200 mM with respect to NaCl, nucleic acids in the aqueous phase were precipitated overnight at
-20°C, with 2.5 volumes of cold absolute ethanol. The precipitates were collected by centrifugation at
10,000 RPM for 40 min., washed with 70% ethanol containing 20 mM NaCl, and with 100% cold ethanol,
dried for 5 min. in a desiccator, and dissolved in water.

The isolated nucleic acids from the infectious chimpanzee plasma pool were tailed with poly rA utilizing
poly-A polymerase in the presence of human placenta ribonuclease inhibitor (HPRI) (purchased from
Amersham Corp.), utilizing MS2 RNA as carrier. Isolated nucleic acids equivalent to that in 2 ml of plasma
were incubated in a solution containing TMN (50 mM Tris HCl, pH 7.9, 10 mM MgCl2, 250 mM NaCl, 2.5
mM MnCl2, 2 mM dithiothreitol (DTT)), 40 micromolar alpha-[32P] ATP, 20 units HPRI (Amersham Corp.),
and about 9 to 10 units of RNase free poly-A polymerase (BRL). Incubation was for 10 min. at 37°C, and
the reactions were stopped with EDTA (final concentration about 250 mM). The solution was extracted with
an equal volume of phenol-chloroform, and with an equal volume of chloroform, and nucleic acids were
precipitated overnight at -20°C with 5.5 volumes of ethanol in the presence of 200 mM NaCl.
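
The volumes involved in an ethanol precipitation of this kind follow from simple dilution arithmetic. The sketch below is illustrative only: the 5 M NaCl stock and the 10 ml sample volume are assumptions not stated in the text, and the ethanol is taken as 2.5 volumes of the salt-adjusted aqueous phase, the figure given for the first precipitation above.

```python
def stock_volume_ml(sample_ml, stock_molar, final_molar):
    """Volume of stock to add so the mixture reaches `final_molar`,
    accounting for the volume added: stock * V = final * (sample + V)."""
    if final_molar >= stock_molar:
        raise ValueError("final concentration must be below the stock concentration")
    return final_molar * sample_ml / (stock_molar - final_molar)


def ethanol_volume_ml(aqueous_ml, volumes):
    """Ethanol to add, expressed as a multiple of the aqueous volume."""
    return aqueous_ml * volumes


sample_ml = 10.0                       # hypothetical aqueous phase volume
nacl_ml = stock_volume_ml(sample_ml, stock_molar=5.0, final_molar=0.2)
total_ml = sample_ml + nacl_ml         # volume after salt adjustment
ethanol_ml = ethanol_volume_ml(total_ml, volumes=2.5)
print(f"add {nacl_ml:.2f} ml of 5 M NaCl, then {ethanol_ml:.1f} ml ethanol")
```

The same two functions apply to any other sample volume or ethanol ratio; only the input values change.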



Isolation of Clone b5a 



The beta HCV cDNA library was screened by hybridization using a synthetic probe, which had a
sequence based upon the HCV cDNA sequence in clone 15e. The isolation of clone 15e is described in
EPO Pub. No. 318,216, and its sequence is shown in Fig. 3. The sequence of the synthetic probe was:
5' ATT GCG AGA TCT ACG GGG CCT GCT ACT CCA 3'.

Screening of the library yielded clone beta-5a (b5a), which contains an HCV cDNA region of approximately
1000 base pairs. The 5'-region of this cDNA overlaps clones 35f, 19g, 26g, and 15e (these clones are
described supra). The region between the 3'-terminal poly-A sequence and the 3'-sequence which overlaps
clone 15e contains approximately 200 base pairs. This clone allows the identification of a region of the
3'-terminal sequence of the HCV genome.

The sequence of b5a is contained within the sequence of the HCV cDNA in clone 16jh (described infra).
Moreover, the sequence is also present in CC34a, isolated from the original lambda-gt11 library (ATCC No.
40394). (The original lambda-gt11 library is referred to herein as the "C" library).

Isolation and Sequence of Clones Generated by PCR Amplification of the 3'-Region of the HCV Genome

Multiple cDNA clones have been generated which contain nucleotide sequences derived from the
3'-region of the HCV genome. This was accomplished by amplifying a targeted region of the genome by a
polymerase chain reaction technique described in Saiki et al. (1986), and in Saiki et al. (1988), which was
modified as described below. The HCV RNA which was amplified was obtained from the original infectious
chimpanzee plasma pool which was used for the creation of the HCV cDNA lambda-gt11 library (ATCC No.
40394) described in EPO Pub. No. 318,216. Isolation of the HCV RNA was as described supra. The isolated
RNA was tailed at the 3'-end with ATP by E. coli poly-A polymerase as described in Sippel (1973), except
that the nucleic acids isolated from chimp serum were substituted for the nucleic acid substrate. The tailed
RNA was then reverse transcribed into cDNA by reverse transcriptase, using an oligo dT-primer adapter,
essentially as described by Han (1987), except that the components and sequence of the primer-adapter
were:



    Stuffer    NotI        SP6 Promoter                  Primer
    AATTC      GCGGCCGC    CATACGATTTAGGTGACACTATAGAA    T15



The resultant cDNA was subjected to amplification by PCR using two primers: 



    Primer          Sequence
    JH32 (30mer)    ATAGCGGCCGCCCTCGATTGCGAGATCTAC
    JH11 (20mer)    AATTCGGGCGGCCGCCATACGA


The JH32 primer contained 20 nucleotide sequences hybridizable to the 5'-end of the target region in the
cDNA. The JH11 was derived from a portion of the oligo dT-primer adapter, with a Tm of 64°C. Both
primers were designed to have a recognition site for the restriction enzyme, NotI, at the 5'-end, for use in
subsequent cloning of the amplified HCV cDNA.

The PCR reaction was carried out by suspending the cDNA and the primers in 100 microliters of
reaction mixture containing the four deoxynucleoside triphosphates, buffer salts and metal ions, and a
thermostable DNA polymerase isolated from Thermus aquaticus (Taq polymerase), which are in a
?T^^lHwIo048 or NSOI-OOSBTrheTCRT^ctTon was performed for 35 cycles In a Perkm 
SZSl om S^ler Each cycle consisted of a 1.5. min denaturation step at 94 C. an 
Elmer Cetus DNA tn^ai cy y at72 C for 3 min. The PCFI products were 

^;rusing P a 30 nucleo.de probe JH34. the sequence of which was based 

upon that of the 3'-terminal region of clone 15e. The sequence of JH34 is: 

h order to done the 

P-'vMo^r cloned between the EcoRI and 
pUCiaS The ^vector puu o ^ JH34 prQbe A numbQr Qf pos|tlve 

Sail sues of ptTCIBJhej ^ l0 ^ s ^ SC T ^ n nuCle0tide gequence of the HCV cDNA insert in one of these 
SZ'Sn l^t-nTSfio^ therein, are' shown in Fig. 15. A nucleotide heterogeneity. 
&TJiil^«5 HCV cDNA in Cone 16jh as compared to another clone of th,s region. . 
indicated In the figure. 

Compiled HCV cDNA Sequences 

An HCV cDNA sequence has been compiled from a series of overlapping clones derived from the various
HCV cDNA libraries described supra. In this sequence, the [illegible] obtained from clones b114a, 18g,
ag30a, CA205a, CA290a, [illegible] is upstream of the compiled HCV cDNA sequence published in EPO Pub.
No. 318,216, which is shown in [illegible]. The compiled HCV cDNA sequence obtained from clones b5a and
16jh is downstream of the compiled HCV cDNA sequence published in EPO Pub. No. 318,216.

[illegible] probe was the synthetic probe used to detect clone 18g, supra. Clone b114a overlaps with
clones 18g, [illegible], and CA205a, except that clone b114a contains an extra two nucleotides upstream
of the sequence in clone [illegible]. [illegible] two nucleotides have been included in the HCV
sequence [illegible].

It should be noted that although several of the clones described supra. have been obtained from
libraries [illegible] the original HCV cDNA lambda-gt11 C library (ATCC No. 40394), these clones contain
HCV cDNA sequences which overlap HCV cDNA sequences in the original library. Thus, essentially all of
[illegible] from the original [illegible] library (ATCC [illegible]) to isolate the first HCV cDNA clone
(5-1-1). The isolation of clone 5-1-1 is described in EPO Pub. No. 318,216.

Purification of Fusion polypeptide C100-3 (Alternate method) 

The fusion polypeptide, C100-3 (also called HCV c100-3 and alternatively, c100-3), is comprised of
superoxide dismutase (SOD) at the N-terminus and an in-frame C100 HCV polypeptide at the C-terminus. A
method for preparing the polypeptide by [illegible] fraction of the extracted host yeast [illegible]
method the antigen is precipitated from [illegible] chromatography, and further purified by gel
filtration [illegible].

The fusion polypeptide [illegible] (ATCC No. 20879) transformed with pAB24C100-3 (ATCC [illegible]).
[illegible] yeast are grown under conditions which allow expression (i.e., by growth in YEP
[illegible]) [illegible] EPO Pub. No. 318,216). A cell lysate [illegible] mM PMSF. The [illegible]
combining the [illegible] and homogenate. [illegible]

The material in the pellet is washed to [illegible] suspension in Buffer B (50 mM glycine, pH
[illegible], 1 mM DTT, 500 mM NaCl) followed by [illegible] pH 10.0, 1 mM DTT). The [illegible]
containing SDS. [illegible] polypeptide are pooled.

In order to purify the HCV c100-3 polypeptide by gel filtration, the pooled fractions from the
ion-exchange column are heated in [illegible] mercaptoethanol and SDS, and the eluate is concentrated
by ultrafiltration. The [illegible] column previously equilibrated with Buffer E (20 mM Tris HCl, pH
7.0, 1 mM DTT, [illegible]). The presence of HCV c100-3 in the eluted fractions, as well as the
presence of impurities, are determined by [illegible] electrophoresis on polyacrylamide gels in the
presence of SDS and [illegible] purified [illegible] by repeating the gel filtration [illegible]. HCV
c100-3 containing material may be filtered through a 0.22 micron filter.

Expression and Antigenicity of Polypeptides Encoded in HCV cDNA



Polypeptides Expressed in E. coli



The polypeptides encoded in a number of [illegible] HCV cDNAs which span the HCV genomic ORF were
expressed in E. coli, and tested for their antigenicity [illegible] from a variety of individuals with
NANBH. The expression vectors containing the cloned HCV cDNAs were constructed from pSODcf1 (Steimer et
al. (1986)). In order to be certain [illegible] expression vectors, pcf1AB, [illegible] of three
linkers, AB, CD, and EF to a BamHI-EcoRI fragment derived by digesting [illegible] vector pSODcf1 with
EcoRI and BamHI, followed by treatment with [illegible] oligomers, A, B, [illegible] kinase in the
presence of ATP prior to [illegible]
Name    DNA Sequence (5' to 3')

A       GATC CTG AAT TCC TGA TAA
B            GAC TTA AGG ACT ATT TTA A

C       GATC CGA ATT CTG TGA TAA
D            GCT TAA GAC ACT ATT TTA A

E       GATC CTG GAA TTC TGA TAA
F            GAC CTT AAG ACT ATT TTA A

Each of the three linkers destroys the original EcoRI site, and creates a new EcoRI site within the linker, but
within a different reading frame. Hence, the HCV cDNA EcoRI fragments isolated from the clones, when
inserted into the expression vector, were in three different reading frames.
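
The reading-frame shift introduced by the linkers can be illustrated by locating the EcoRI site (GAATTC) in the upper strand of each linker and taking its offset modulo 3. This is a minimal sketch under stated assumptions: the pairing of the three upper-strand sequences with the linker names AB, CD, and EF is assumed from the order in which they are listed, the offset is counted from the first nucleotide of the linker (an arbitrary reference point), and the names `LINKER_TOP_STRANDS` and `ecori_frame` are hypothetical.

```python
# Sketch: show that the EcoRI site sits in a different frame in each linker.
# Assumptions: upper strands as listed for the three linkers; frame is the
# offset of GAATTC from the linker's first nucleotide, modulo 3.

ECORI_SITE = "GAATTC"

LINKER_TOP_STRANDS = {
    "AB": "GATC CTG AAT TCC TGA TAA",
    "CD": "GATC CGA ATT CTG TGA TAA",
    "EF": "GATC CTG GAA TTC TGA TAA",
}


def ecori_frame(top_strand: str) -> int:
    """Return the offset of the EcoRI site modulo 3 (0, 1, or 2)."""
    seq = top_strand.replace(" ", "").upper()
    position = seq.find(ECORI_SITE)
    if position < 0:
        raise ValueError("no EcoRI site found in linker")
    return position % 3


FRAMES = {name: ecori_frame(seq) for name, seq in LINKER_TOP_STRANDS.items()}

if __name__ == "__main__":
    for name, frame in FRAMES.items():
        print(f"linker {name}: EcoRI site in frame {frame}")
```

Because the three offsets are all different modulo 3, an EcoRI fragment inserted into the three vectors is read in each of the three possible frames, one of which matches the frame of the upstream coding sequence.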

The HCV cDNA fragments in the designated lambda-gt11 clones were excised by digestion with EcoRI;
each fragment was inserted into pcf1AB, pcf1CD, and pcf1EF. These expression constructs were then
transformed into D1210 E. coli cells, the transformants were cloned, and recombinant bacteria from each
clone were induced to express the fusion polypeptides by growing the bacteria in the presence of IPTG.

Expression products of the indicated HCV cDNAs were tested for antigenicity by direct immunological
screening of the colonies, using a modification of the method described in Helfman et al. (1983). Briefly, as
shown in Fig. 18, the bacteria were plated onto nitrocellulose filters overlaid on ampicillin plates to give
approximately 1,000 colonies per filter. Colonies were replica plated onto nitrocellulose filters, and the
replicas were regrown overnight in the presence of 2 mM IPTG and ampicillin. The bacterial colonies were
lysed by suspending the nitrocellulose filters for about 15 to 20 min in an atmosphere saturated with CHCl3
vapor. Each filter then was placed in an individual 100 mm Petri dish containing 10 ml of 50 mM Tris HCl,
pH 7.5, 150 mM NaCl, 5 mM MgCl2, 3% (w/v) BSA, 40 micrograms/ml lysozyme, and 0.1 microgram/ml
DNase. The plates were agitated gently for at least 8 hours at room temperature. The filters were rinsed in
TBST (50 mM Tris HCl, pH 8.0, 150 mM NaCl, 0.005% Tween 20). After incubation, the cell residues were
rinsed and incubated in TBS (TBST without Tween) containing 10% sheep serum; incubation was for 1
hour. The filters were then incubated with pretreated sera in TBS from individuals with NANBH, which
included: 3 chimpanzees; 8 patients with chronic NANBH whose sera were positive with respect to
antibodies to HCV C100-3 polypeptide (described in EPO Pub. No. 318,216, and supra.) (also called C100);
8 patients with chronic NANBH whose sera were negative for anti-C100 antibodies; a convalescent patient
whose serum was negative for anti-C100 antibodies; and 6 patients with community acquired NANBH,
including one whose serum was strongly positive with respect to anti-C100 antibodies, and one whose serum
was marginally positive with respect to anti-C100 antibodies. The sera, diluted in TBS, were pretreated by
preabsorption with hSOD. Incubation of the filters with the sera was for at least two hours. After incubation,
the filters were washed two times for 30 min with TBST. Labeling of expressed proteins to which antibodies
in the sera bound was accomplished by incubation for 2 hours with 125I-labeled sheep anti-human antibody.
After labeling, the filters were washed twice for 30 min with TBST, dried, and autoradiographed.

A number of clones (see infra.) expressed polypeptides containing HCV epitopes which were
immunologically reactive with serum from individuals with NANBH. Five of these polypeptides were very
immunogenic in that antibodies to HCV epitopes in these polypeptides were detected in many different
patient sera. The clones encoding these polypeptides, and the location of the polypeptide in the putative
HCV polyprotein (wherein the amino acid numbers begin with the putative initiator codon) are the following:
clone 5-1-1, amino acids 1694-1735; clone C100, amino acids 1569-1931; clone 33c, amino acids 1192-1457;
clone CA279a, amino acids 1-84; and clone CA290a, amino acids 9-177. The locations of the
immunogenic polypeptides within the putative HCV polyprotein are shown immediately below.
Clones encoding polypeptides of proven reactivity with sera from NANBH patients.

Clone       Location within the HCV polyprotein
            (amino acid no. beginning with putative initiator methionine)

CA279a      1-84
CA74a       437-582
13i         511-690
CA290a      9-177
33c         1192-1457
40b         1266-1428
5-1-1       1694-1735
81          1689-1805
33b         1916-2021
25c         1949-2124
14c         2054-2223
8f          2200-2325
33f         2287-2385
33g         2348-2464
39c         2371-2502
15e         2796-2886
C100        1569-1931



The results on the immunogenicity of the polypeptides encoded in the various clones examined 
suggest efficient detection and immunization systems may include panels of HCV polypeptides/epitopes. 



Expression of HCV Epitopes in Yeast 



Three different yeast expression vectors which allow the insertion of HCV cDNA into three different
reading frames are constructed. The construction of one of the vectors, pAB24C100-3, is described in EPO
Pub. No. 318,216. In the studies below, the HCV cDNA from the clones listed supra. in the antigenicity
mapping study using the E. coli expressed products are substituted for the C100 HCV cDNA. The
construction of the other vectors replaces the adaptor described in the above E. coli studies with one of the
following adaptors:

Adaptor 1

ATT TTG AAT TCC TAA TGA G
     AC TTA AGG ATT ACT CAG CT

Adaptor 2

AAT TTG GAA TTC TAA TGA G
     AC CTT AAG ATT ACT CAG CT

The inserted HCV cDNA is expressed in yeast transformed with the vectors, using the expression
conditions described supra. for the expression of the fusion polypeptide, C100-3. The resulting polypeptides
are screened using the sera from individuals with NANBH, described supra. for the screening of
immunogenic polypeptides encoded in HCV cDNAs expressed in E. coli.



Comparison of the Hydrophobic Profiles of HCV Polyproteins with West Nile Virus Polyprotein and with Dengue Virus NS1



The hydrophobicity profile of an HCV polyprotein segment was compared with that of a typical
Flavivirus, West Nile virus. The polypeptide sequence of the West Nile virus polyprotein was deduced from
the known polynucleotide sequences encoding the non-structural proteins of that virus. The HCV
polyprotein sequence was deduced from the sequence of overlapping cDNA clones. The profiles were
determined using an antigen program which uses a window of 7 amino acid width (the amino acid in
question, and 3 residues on each side) to report the average hydrophobicity about a given amino acid
residue. The parameters giving the relative hydrophobicity for each amino acid residue are from Kyte and
Doolittle (1982). Fig. 19 shows the hydrophobic profiles of the two polyproteins; the areas corresponding to
the non-structural proteins of West Nile virus, ns1 through ns5, are indicated in the figure. As seen in the
figure, there is a general similarity in the profiles of the HCV polyprotein and the West Nile virus
polyprotein.
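
The windowed averaging described above (the residue in question plus 3 residues on each side) can be sketched as follows. This is an illustration only, not the program actually used: the function name `hydropathy_profile` is hypothetical, the scale is the published Kyte and Doolittle (1982) hydropathy index, and the example peptide is an arbitrary string rather than an HCV or West Nile virus sequence.

```python
# Sketch: sliding-window average hydropathy with a 7-residue window.
# Assumptions: Kyte-Doolittle (1982) hydropathy values; positions are
# 1-based and reported only where a full window is available.

KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence: str, window: int = 7) -> list:
    """Return [(position, mean hydropathy)] for each fully covered residue."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    seq = sequence.upper()
    half = window // 2
    profile = []
    for center in range(half, len(seq) - half):
        residues = seq[center - half:center + half + 1]
        mean = sum(KYTE_DOOLITTLE[r] for r in residues) / window
        profile.append((center + 1, mean))  # 1-based residue number
    return profile


if __name__ == "__main__":
    toy_peptide = "MSTNPKPQRKLLVVAAIIGG"  # arbitrary example, not a real sequence
    for position, mean in hydropathy_profile(toy_peptide):
        print(position, round(mean, 2))
```

Positive averages mark hydrophobic stretches and negative averages mark hydrophilic ones; the first and last 3 residues have no value because the window would run past the ends of the sequence.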

The sequence of the amino acids encoded in the 5'-region of HCV cDNA shown in Fig. 16 has been
compared with the corresponding region of one of the strains of Dengue virus, described supra., with
respect to the profile of regions of hydrophobicity and hydrophilicity (data not shown). This comparison
indicated that the polypeptides from HCV and Dengue encoded in this region, which corresponds to the
region encoding NS1 (or a portion thereof), have a similar hydrophobic/hydrophilic profile.

The similarity in hydrophobicity profiles, in combination with the previously identified homologies in the
amino acid sequences of HCV and Dengue Flavivirus in EP 0,318,216, suggests that HCV is related to these
members of the Flavivirus family.



Characterization of the Putative Polypeptides Encoded Within the HCV ORF

The sequence of the HCV cDNA sense strand, shown in Fig. 17, was deduced from the overlapping
HCV cDNAs in the various clones described in EPO Pub. No. 318,216 and those described supra. It may be
deduced from the sequence that the HCV genome contains primarily one long continuous ORF, which
encodes a polyprotein. In the sequence, nucleotide number 1 corresponds to the first nucleotide of the
initiator MET codon; minus numbers indicate that the nucleotides are that distance away in the 5'-direction
(upstream), while positive numbers indicate that the nucleotides are that distance away in the 3'-direction
(downstream). The composite sequence shows the "sense" strand of the HCV cDNA.
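
Under this numbering convention, a nucleotide number in the coding region can be converted to an amino acid number in the putative polyprotein, and the reverse. The sketch below assumes a single uninterrupted reading frame beginning at nucleotide 1, as the text describes; the function names `codon_number` and `codon_span` are hypothetical.

```python
# Sketch: convert between nucleotide numbers and amino acid numbers.
# Assumptions: nucleotide 1 is the first base of the initiator MET codon
# and the ORF is read continuously in that frame.


def codon_number(nucleotide_number: int) -> int:
    """Return the amino acid number whose codon contains this nucleotide."""
    if nucleotide_number < 1:
        raise ValueError("nucleotide lies upstream of the initiator codon")
    return (nucleotide_number - 1) // 3 + 1


def codon_span(amino_acid_number: int) -> tuple:
    """Return (first, last) nucleotide numbers of the codon for this residue."""
    if amino_acid_number < 1:
        raise ValueError("amino acid numbers start at 1")
    first = 3 * (amino_acid_number - 1) + 1
    return (first, first + 2)


if __name__ == "__main__":
    print(codon_number(1), codon_number(4))  # residues 1 and 2
    print(codon_span(84))                    # codon of residue 84
```

Nucleotides with minus numbers lie in the untranslated region upstream of the initiator codon and therefore have no amino acid number, which the first function reports as an error.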

The amino acid sequence of the putative HCV polyprotein deduced from the HCV cDNA sense strand
sequence is also shown in Fig. 17, where position 1 begins with the putative initiator methionine.

Possible protein domains of the encoded HCV polyprotein, as well as the approximate boundaries, are
the following (the polypeptides identified within the parentheses are those which are encoded in the
Flavivirus domain):
Putative Domain                                                        Approximate Boundary
                                                                       (amino acid nos.)

"C" (nucleocapsid protein)                                             1-120
"E" (virion envelope protein(s) and possibly matrix (M) proteins)      120-400
"NS1" (complement fixation antigen?)                                   400-660
"NS2" (unknown function)                                               660-1050
"NS3" (protease?)                                                      1050-1640
"NS4" (unknown function)                                               1640-2000
"NS5" (polymerase)                                                     2000-? end

It should be noted, however, that hydrophobicity profiles (described infra), indicate that HCV diverges
from the Flavivirus model, particularly with respect to the region upstream of NS2. Moreover, the
boundaries indicated are not intended to show firm demarcations between the putative polypeptides.
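
The approximate boundaries above can be used to place a given residue, or a clone-encoded region, within a putative domain. The following is a minimal sketch under stated assumptions: the "C" domain is assumed to begin at residue 1, a residue lying exactly on a boundary is arbitrarily assigned to the downstream domain, and the names `DOMAIN_STARTS`, `DOMAIN_NAMES`, and `putative_domain` are hypothetical. Since the boundaries are not firm demarcations, the result is a rough guide only.

```python
# Sketch: look up the putative domain containing an amino acid position.
# Assumptions: approximate domain start positions as tabulated; the C
# domain starts at residue 1; boundary residues go to the downstream domain.

from bisect import bisect_right

DOMAIN_STARTS = [1, 120, 400, 660, 1050, 1640, 2000]
DOMAIN_NAMES = ["C", "E", "NS1", "NS2", "NS3", "NS4", "NS5"]


def putative_domain(amino_acid_number: int) -> str:
    """Return the name of the putative domain containing the residue."""
    if amino_acid_number < 1:
        raise ValueError("amino acid numbers start at 1")
    index = bisect_right(DOMAIN_STARTS, amino_acid_number) - 1
    return DOMAIN_NAMES[index]


if __name__ == "__main__":
    # Residues at the ends of the regions given in the text for two clones.
    for clone, (start, end) in {"5-1-1": (1694, 1735), "33c": (1192, 1457)}.items():
        print(clone, putative_domain(start), putative_domain(end))
```

By this lookup the region encoded in clone 5-1-1 falls within the putative NS4 domain and the region encoded in clone 33c within the putative NS3 domain.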


The Hydrophilic and Antigenic Profile of the Polypeptide 



Profiles of the hydrophilicity/hydrophobicity and the antigenic index of the putative polyprotein encoded
in the HCV cDNA sequence shown in Fig. 16 were determined by computer analysis. The program for
hydrophilicity/hydrophobicity was as described supra. The antigenic index results from a computer program
which relies on the following criteria: 1) surface probability; 2) prediction of alpha-helicity by two different
methods; 3) prediction of beta-sheet regions by two different methods; 4) prediction of U-turns by two
different methods; 5) hydrophilicity/hydrophobicity; and flexibility. The traces of the profiles generated by
the computer analyses are shown in Fig. 20. In the hydrophilicity profile, deflection above the abscissa
indicates hydrophilicity, and below the abscissa indicates hydrophobicity. The probability that a polypeptide
region is antigenic is usually considered to increase when there is a deflection upward from the abscissa in
the hydrophilic and/or antigenic profile. It should be noted, however, that these profiles are not necessarily
indicators of the strength of the immunogenicity of a polypeptide.

Identification of Co-linear Peptides in HCV and Flaviviruses

The amino acid sequence of the putative polyprotein encoded in the HCV cDNA sense strand was
compared with the known amino acid sequences of several members of Flaviviruses. The comparison
shows that homology is slight, but due to the regions in which it is found, it is probably significant. The
conserved colinear regions are shown in Fig. 21. The amino acid numbers listed below the sequences
represent the number in the putative HCV polyprotein (see Fig. 17).

The spacing of these conserved motifs is similar between the Flaviviruses and HCV, and implies that
there is some similarity between HCV and these flaviviral agents.

The following listed materials are on deposit under the terms of the Budapest Treaty with the American
Type Culture Collection (ATCC), 12301 Parklawn Dr., Rockville, Maryland 20852, and have been assigned
the following Accession Numbers.
lambda-gt11             ATCC No.    Deposit Date

HCV cDNA library        40394       1 Dec. 1987
clone 81                40388       17 Nov. 1987
clone 91                40389       17 Nov. 1987
clone 1-2               40390       17 Nov. 1987
clone 5-1-1             40391       18 Nov. 1987
clone 12f               40514       10 Nov. 1988
clone 35f               40511       10 Nov. 1988
clone 15e               40513       10 Nov. 1988
clone K9-1              40512       10 Nov. 1988
JSC 308                 20879       5 May 1988
pS356                   67683       29 April 1988



In addition, the following deposits were made on 11 May 1989. 



Strain                      Linkers     ATCC No.

D1210 (Cf1/5-1-1)           EF          67967
D1210 (Cf1/81)              EF          67968
D1210 (Cf1/CA74a)           EF          67969
D1210 (Cf1/35f)             AB          67970
D1210 (Cf1/279a)            EF          67971
D1210 (Cf1/C36)             CD          67972
D1210 (Cf1/13i)             AB          67973
D1210 (Cf1/C33b)            EF          67974
D1210 (Cf1/CA290a)          AB          67975
HB101 (AB24/C100#3R)                    67976



The following derivatives of strain D1210 were deposited on 3 May 1989.

Strain Derivative       ATCC No.

pCF1CS/C8f              67956
pCF1AB/C12f             67952
pCF1EF/14c              67949
pCF1EF/15e              67954
pCF1AB/C25c
pCF1EF/C33c             67953
pCF1EF/C33f             67050
pCF1CD/33g              67951
pCF1CD/C39c             67955
pCF1EF/C40b             67957
pCF1EF/CA167b           67959



The following strains were deposited on May 12, 1989. 
Strain                      ATCC No.

Lambda gt11 (C35)           40603
Lambda gt10 (beta-5a)       40602
D1210 (C40b)                67980
D1210 (M16)                 67981



The deposited materials mentioned herein are intended for convenience only, and are not required to 
practice the present invention in view of the descriptions herein, and in addition these materials are 
incorporated herein by reference. 






Industrial Applicability 






The invention, in the various manifestations disclosed herein, has many industrial uses, some of which
are the following. The HCV cDNAs may be used for the design of probes for the detection of HCV nucleic
acids in samples. The probes derived from the cDNAs may be used to detect HCV nucleic acids in, for
example, chemical synthetic reactions. They may also be used in screening programs for anti-viral agents,
to determine the effect of the agents in inhibiting viral replication in cell culture systems, and animal model
systems. The HCV polynucleotide probes are also useful in detecting viral nucleic acids in humans, and
thus, may serve as a basis for diagnosis of HCV infections in humans.

In addition to the above, the cDNAs provided herein provide information and a means for synthesizing 
polypeptides containing epitopes of HCV. These polypeptides are useful in detecting antibodies to HCV 
antigens. A series of immunoassays for HCV infection, based on recombinant polypeptides containing HCV 
epitopes are described herein, and will find commercial use in diagnosing HCV induced NANBH, in 
screening blood bank donors for HCV-caused infectious hepatitis, and also for detecting contaminated blood 
from infectious blood donors. The viral antigens will also have utility in monitoring the efficacy of anti-viral 
agents in animal model systems. In addition, the polypeptides derived from the HCV cDNAs disclosed 
herein will have utility as vaccines for treatment of HCV infections. 

The polypeptides derived from the HCV cDNAs, besides the above stated uses, are also useful for
raising anti-HCV antibodies. Thus, they may be used in anti-HCV vaccines. However, the antibodies
produced as a result of immunization with the HCV polypeptides are also useful in detecting the presence
of viral antigens in samples. Thus, they may be used to assay the production of HCV polypeptides in
chemical systems. The anti-HCV antibodies may also be used to monitor the efficacy of anti-viral agents in
screening programs where these agents are tested in tissue culture systems. They may also be used for
passive immunotherapy, and to diagnose HCV caused NANBH by allowing the detection of viral antigen(s)
in both blood donors and recipients. Another important use for anti-HCV antibodies is in affinity
chromatography for the purification of virus and viral polypeptides. The purified virus and viral polypeptide
preparations may be used in vaccines. However, the purified virus may also be useful for the development of cell
culture systems in which HCV replicates.

Antisense polynucleotides may be used as inhibitors of viral replication. 

For convenience, the anti-HCV antibodies and HCV polypeptides, whether natural or recombinant, may 
be packaged into kits. 
Claims

1. A recombinant polynucleotide comprising a sequence derived from HCV cDNA, wherein the HCV
cDNA is in clone 13i, or clone 26j, or clone 59a, or clone 84a, or clone CA156e, or clone 167b, or clone
pi14a, or clone CA216a, or clone CA290a, or clone ag30a, or clone 205a, or clone 18g, or clone 16jh, or
wherein the HCV cDNA is of a sequence indicated by nucleotide numbers -319 to 1348 or 8659 to 8866 in
Fig. 17.

2. A recombinant polynucleotide according to claim 1, encoding an epitope of HCV. 

3. A recombinant vector comprising the polynucleotide of claim 1 or claim 2. 

4. A host cell transformed with the vector of claim 3.

5. A recombinant expression system comprising an open reading frame (ORF) of DNA derived from the
recombinant polynucleotide of claim 1 or claim 2, wherein the ORF is operably linked to a control sequence
compatible with a desired host.

6. A cell transformed with the recombinant expression system of claim 5.

7. A polypeptide produced by the cell of claim 6.

8. A purified polypeptide comprising an epitope encoded within HCV cDNA wherein the HCV cDNA is
of a sequence indicated by nucleotide numbers -319 to 1348 or 8659 to 8866 in Fig. 17.

9. An immunogenic polypeptide produced by a cell transformed with a recombinant expression vector
comprising an ORF of DNA derived from HCV cDNA, wherein the HCV cDNA is comprised of a sequence
derived from the HCV cDNA sequence in clone CA279a, or clone CA74a, or clone 13i, or clone CA290a, or
clone 33c, or clone 40b, or clone 33b, or clone 25c, or clone 14c, or clone 8f, or clone 33f, or clone 33g, or
clone 39c, or clone 15e, and wherein the ORF is operably linked to a control sequence compatible with a
desired host.

10. A peptide comprising an HCV epitope, wherein the peptide is of the formula
AAx-AAy,
wherein x and y designate amino acid numbers shown in Fig. 17, and wherein the peptide is selected from
the group consisting of AA1-AA25, AA1-AA50, AA1-AA84, AA9-AA177, AA1-AA10, AA5-AA20, AA20-AA25,
AA35-AA45, AA50-AA100, AA40-AA90, AA45-AA65, AA65-AA75, AA80-AA90, AA99-AA120, AA95-AA110,
AA105-AA120, AA100-AA150, AA150-AA200, AA155-AA170, AA190-AA210, AA200-AA250, AA220-AA240,
AA245-AA265, AA250-AA300, AA290-AA330, AA290-AA305, AA300-AA350, AA310-AA330, AA350-AA400,
AA380-AA395, AA405-AA495, AA400-AA450, AA405-AA415, AA415-AA425, AA425-AA435, AA437-AA582,
AA450-AA500, AA440-AA460, AA460-AA470, AA475-AA495, AA500-AA550, AA511-AA690, AA515-AA550,
AA550-AA600, AA550-AA625, AA575-AA605, AA585-AA600, AA600-AA650, AA600-AA625, AA635-AA665,
AA650-AA700, AA645-AA680, AA700-AA750, AA700-AA725, AA700-AA750, AA725-AA775, AA770-AA790,
AA750-AA800, AA800-AA815, AA825-AA850, AA850-AA875, AA800-AA850, AA920-AA990, AA850-AA900,
AA920-AA945, AA940-AA965, AA970-AA990, AA950-AA1000, AA1000-AA1060, AA1000-AA1025,
AA1000-AA1050, AA1025-AA1040, AA1040-AA1055, AA1075-AA1175, AA1050-AA1200, AA1070-AA1100,
AA1100-AA1130, AA1140-AA1165, AA1192-AA1457, AA1195-AA1250, AA1200-AA1225, AA1225-AA1250,
AA1250-AA1300, AA1260-AA1310, AA1260-AA1280, AA1266-AA1428, AA1300-AA1350, AA1290-AA1310,
AA1310-AA1340, AA1345-AA1405, AA1345-AA1365, AA1350-AA1400, AA1365-AA1380, AA1380-AA1405,
AA1400-AA1450, AA1450-AA1500, AA1460-AA1475, AA1475-AA1515, AA1475-AA1500, AA1500-AA1550,
AA1500-AA1515, AA1515-AA1550, AA1550-AA1600, AA1545-AA1560, AA1569-AA1931, AA1570-AA1590,
AA1595-AA1610, AA1590-AA1650, AA1610-AA1645, AA1650-AA1690, AA1685-AA1770, AA1689-AA1805,
AA1690-AA1720, AA1694-AA1735, AA1720-AA1745, AA1745-AA1770, AA1750-AA1800, AA1775-AA1810,
AA1795-AA1850, AA1850-AA1900, AA1900-AA1950, AA1900-AA1920, AA1916-AA2021, AA1920-AA1940,
AA1949-AA2124, AA1950-AA2000, AA1950-AA1985, AA1980-AA2000, AA2000-AA2050, AA2005-AA2025,
AA2020-AA2045, AA2045-AA2100, AA2045-AA2070, AA2054-AA2223, AA2070-AA2100, AA2100-AA2150,
AA2150-AA2200, AA2200-AA2250, AA2200-AA2325, AA2250-AA2330, AA2255-AA2270, AA2265-AA2280,
AA2280-AA2290, AA2287-AA2385, AA2300-AA2350, AA2290-AA2310, AA2310-AA2330, AA2330-AA2350,
AA2350-AA2400, AA2348-AA2464, AA2345-AA2415, AA2345-AA2375, AA2370-AA2410, AA2371-AA2502,
AA2400-AA2450, AA2400-AA2425, AA2415-AA2450, AA2445-AA2500, AA2445-AA2475, AA2470-AA2490,
AA2500-AA2550, AA2505-AA2540, AA2535-AA2560, AA2550-AA2600, AA2560-AA2580, AA2600-AA2650,
AA2605-AA2620, AA2620-AA2650, AA2640-AA2660, AA2650-AA2700, AA2655-AA2670, AA2670-AA2700,
AA2700-AA2750, AA2740-AA2760, AA2750-AA2800, AA2755-AA2780, AA2780-AA2830, AA2785-AA2810,
AA2796-AA2886, AA2810-AA2825, AA2800-AA2850, AA2850-AA2900, AA2850-AA2865, AA2885-AA2905,
AA2900-AA2950, AA2910-AA2930, AA2925-AA2950, AA2945-end (C' terminal).

11. A polypeptide comprised of the peptide of claim 10.

12. An immunogenic polypeptide attached to a solid substrate, wherein the polypeptide is according to
claim 7, or claim 8, or claim 9, or claim 10, or claim 11, or wherein the polypeptide is comprised of an
epitope encoded within HCV cDNA wherein the HCV cDNA is of a sequence indicated by nucleotide
numbers -319 to 1348 or 8659 to 8866 in Fig. 17.

13. A monoclonal antibody directed against an epitope encoded in HCV cDNA, wherein the HCV cDNA
is of a sequence indicated by nucleotide numbers -319 to 1348 or 8659 to 8866 in Fig. 17, or is the
sequence present in clone 13i, or clone 26j, or clone 59a, or clone 84a, or clone CA156e, or clone 167b, or
clone pi14a, or clone CA216a, or clone CA290a, or clone ag30a, or clone 205a, or clone 18g, or clone 16jh.

14. A preparation of purified polyclonal antibodies directed against a polypeptide comprised of an
epitope encoded within HCV cDNA, wherein the HCV cDNA is of a sequence indicated by nucleotide
numbers -319 to 1348 or 8659 to 8866 in Fig. 17, or is the sequence present in clone 13i, or clone 26j, or
clone 59a, or clone 84a, or clone CA156e, or clone 167b, or clone pi14a, or clone CA216a, or clone
CA290a, or clone ag30a, or clone 205a, or clone 18g, or clone 16jh.

15. A polynucleotide probe for HCV, wherein the probe is comprised of an HCV sequence derived from
an HCV cDNA sequence indicated by nucleotide numbers -319 to 1348 or 8659 to 8866 in Fig. 17, or from
the complement of the HCV cDNA sequence.

16. A kit for analyzing samples for the presence of polynucleotides from HCV comprising a
polynucleotide probe containing a nucleotide sequence of about 8 or more nucleotides, wherein the nucleotide
sequence is derived from HCV cDNA which is of a sequence indicated by nucleotide numbers -319 to 1348
or 8659 to 8866 in Fig. 17, wherein the polynucleotide probe is in a suitable container.

17. A kit for analyzing samples for the presence of an HCV antigen comprising an antibody
[illegible] immunologically with an HCV antigen, wherein the [illegible]
cDNA which is of a sequence indicated by nucleotide numbers -319 to 1348 or 8659 to 8866 in Fig. 17 or
wherein the HCV cDNA is in clone 13i, or clone 26j, or clone 59a, or clone 84a, or clone CA156e, or clone
167b, or clone pi14a, or clone CA216a, or clone CA290a, or clone ag30a, or clone 205a, or clone 18g, or
clone 16jh.

18. [illegible] for the presence of an HCV antibody comprising an antigenic
polypeptide containing an HCV epitope encoded within HCV cDNA which is of a sequence indicated by
nucleotide numbers -319 to 1348 or 8659 to 8866 in Fig. 17, or is in clone 13i, or clone 26j, or clone 59a, or
[illegible] or clone CA156e, or clone 167b, or clone pi14a, or clone CA216a, or clone CA290a, or clone
[illegible].

19. [illegible] presence of an HCV antibody comprising an antigenic
polypeptide expressed from HCV cDNA in clone CA279a, or clone CA74a, or clone 13i, or clone CA290a, or
[illegible] clone 40b, or clone 33b, or clone 25c, or clone 14c, or clone 8f, or clone 33f, or clone 33g, or
clone 39c, or clone 15e, wherein the antigenic polypeptide is present in a suitable container.
20. A method for detecting HCV nucleic acids in a sample comprising:
(a) [illegible] nucleic acids of the sample with a polynucleotide probe for HCV, wherein the probe is
comprised of an HCV sequence derived from an HCV cDNA sequence [which] is of a sequence indicated by
[illegible] wherein the [illegible] is under
[illegible] duplex between the probe and the HCV nucleic acid from the
sample; and
(b) detecting a polynucleotide duplex which contains the probe, formed in step (a).

21. An immunoassay for detecting an HCV antigen comprising:
(a) incubating a sample suspected of containing an HCV antigen with an antibody directed against an
epitope encoded in HCV cDNA, wherein the HCV cDNA is of a sequence indicated by nucleotide numbers
-319 to 1348 or 8659 to 8866 in Fig. 17, or is the sequence present in clone 13i, or clone 26j, or clone 59a,
or clone 84a, or clone CA156e, or clone 167b, or clone pi14a, or clone CA216a, or clone CA290a, or clone
ag30a, or clone 205a, or clone 18g, or clone 16jh, and wherein the incubating is under conditions which
allow formation of an antigen-antibody complex; and
(b) detecting an antibody-antigen complex formed in step (a) which contains the antibody.

22. An immunoassay for detecting antibodies directed against an HCV antigen comprising:
(a) incubating a sample suspected of containing anti-HCV antibodies with an antigen polypeptide
containing an epitope encoded in HCV cDNA, wherein the HCV cDNA is of a sequence indicated by
nucleotide numbers -319 to 1348 or 8659 to 8866 in Fig. 17, or is the sequence present in clone 13i, or
clone 26j, or clone 59a, or clone 84a, or clone CA156e, or clone 167b, or clone pi14a, or clone CA216a, or
clone CA290a, or clone ag30a, or clone 205a, or clone 18g, or clone 16jh, and wherein the incubating is
under conditions which allow formation of an antigen-antibody complex; and
(b) detecting an antibody-antigen complex formed in step (a) which contains the antigen polypeptide.

23. An immunoassay for detecting antibodies directed against an HCV antigen comprising:
(a) incubating a sample suspected of containing anti-HCV antibodies with the polypeptide of claim 9,
under conditions which allow formation of an antigen-antibody complex; and
(b) detecting an antibody-antigen complex formed in step (a) which contains the antigen polypeptide.

24. A vaccine for treatment of HCV infection comprising an immunogenic polypeptide containing an
HCV epitope encoded in HCV cDNA wherein the HCV cDNA is of a sequence indicated by nucleotide
numbers -319 to 1348 or 8659 to 8866 in Fig. 17, or is the sequence present in clone 13i, or clone 26j, or
clone 59a, or clone 84a, or clone CA156e, or clone 167b, or clone pi14a, or clone CA216a, or clone
CA290a, or clone ag30a, or clone 205a, or clone 18g, or clone 16jh, and wherein the immunogenic
polypeptide is present in a pharmacologically effective dose in a pharmaceutically acceptable excipient.

25. [illegible] HCV comprising administering to an individual an isolated
immunogenic polypeptide containing an HCV [illegible]
a sequence indicated by nucleotide [illegible] CA290a, or clone 33c, or clone 40b, or
[illegible] or clone 33f, or clone 33g, or clone 39c, or clone 15e,
[illegible] is present in a pharmacologically effective dose in a pharmaceu[illegible].

26. An antisense polynucleotide derived from HCV cDNA, wherein the HCV cDNA is that shown in Fig.
17.

27. A method for preparing purified fusion polypeptide C100-3 [illegible]
precipitate,
(c) isolating and solubilizing the precipitated material,
(d) isolating the C100-3 polypeptide by anion exchange chromatography, and
(e) further isolating the C100-3 polypeptide of step (d) by gel filtration.

28. A method for preparing an ^2^.32^ expression system comprising an open 
(a) providing a host cell ^^^cWK^^b HCV cDNA is in clone 131. or clone 26j. 

DMA derived from HCV cC >NA _whena Hto HCV a j ^ ^ ^ 33c Qf c|Qne 

numbers -319 to 1348 or 8659 to 8866 in Fig. 17 comprising:
(a) providing a host cell capable of transformation; 

saww-s- *. — - - - - - •» 

polynucleotide recombinant polynucleotide comprised of a sequence of HCV cDNA

31. A method for preparing a recomotnam WW"™™ *■ , CA156e. or 

derived from the HCV cDNA £ ^V-r^ttS^«£S X- Cone 205a. or clone 18g. 
tZ^TZ^^m^: Cence indicated by nuc-eofide numbers -319 to 1348 or 

8659 a^S'Sormed wilh the recombinant po.ynudeot.de; and 

(b) isolating said polynucleotide from said host cell. 

the formation of antibody-HCV polypeptide complexes;
(d) detecting the complexes formed in step (c); and
(e) saving the blood from which complexes were not detected in (d).
33. A method for preparing blood ^J^JT^^ of M ng HCV polynucleotides; 
(a) providing nucleic -^J^J^S^^fSSK?-. HW sequence derived from an 
HCV £XS:E£E A= ?y 6 Si* numbers -319 » ,348 or 8659 to 8866 in Rg. 
(c) reacting (a) with (b) under conditions which allow the formation of a polynucleotide duplex
between the probe and the HCV nucleic acid from the sample; 

(d) detecting a polynucleotide which contains the probe, formed in step (c); and 

(e) saving the blood from which complexes were not detected in (d). 

34. A method for producing a hybridoma which produces anti-HCV monoclonal antibodies comprising:

(a) immunizing an individual with an immunogenic polypeptide containing an epitope encoded in HCV
cDNA, wherein the HCV cDNA is HCV cDNA in clone 13i, or clone 26j, or clone 59a, or clone 84a, or clone
CA156e, or clone 167b, or clone pi14a, or clone CA216a, or clone CA290a, or clone ag30a, or clone 205a,
or clone 18g, or clone 16jh, or wherein the HCV cDNA is of a sequence indicated by nucleotide numbers
-319 to 1348 or 8659 to 8866 in Fig. 17; or

(b) immunizing an individual with an immunogenic polypeptide prepared according to claim 29;

(c) immortalizing antibody producing cells from the immunized individual;

(d) selecting an immortal cell which produces antibodies which react with an HCV epitope in the
immunogenic polypeptide of (a) or (b); and

(e) growing said immortal cell.
Translation of DNA 12f

IlePheLysIleArgMetTyrValGlyGlyValGluHisArgLeuGluAlaAlaCysAsn
1 CCATATTTAAAATCAGGATGTACGTGGGAGGGGTCGAACACAGGCTGGAAGCTGCCTGCA
GGTATAAATTTTAGTCCTACATGCACCCTCCCCAGCTTGTGTCCGACCTTCGACGGACGT

TrpThrArgGlyGluArgCysAspLeuGluAspArgAspArgSerGluLeuSerProLeu
61 ACTGGACGCGGGGCGAACGTTGCGATCTGGAAGACAGGGACAGGTCCGAGCTCAGCCCGT
TGACCTGCGCCCCGCTTGCAACGCTAGACCTTCTGTCCCTGTCCAGGCTCGAGTCGGGCA

LeuLeuThrThrThrGlnTrpGlnValLeuProCysSerPheThrThrLeuProAlaLeu
121 TACTGCTGACCACTACACAGTGGCAGGTCCTCCCGTGTTCCTTCACAACCCTACCAGCCT
ATGACGACTGGTGATGTGTCACCGTCCAGGAGGGCACAAGGAAGTGTTGGGATGGTCGGA

SerThrGlyLeuIleHisLeuHisGlnAsnIleValAspValGlnTyrLeuTyrGlyVal
181 TGTCCACCGGCCTCATCCACCTCCACCAGAACATTGTGGACGTGCAGTACTTGTACGGGG
ACAGGTGGCCGGAGTAGGTGGAGGTGGTCTTGTAACACCTGCACGTCATGAACATGCCCC

GlySerSerIleAlaSerTrpAlaIleLysTrpGluTyrValValLeuLeuPheLeuLeu
241 TGGGGTCAAGCATCGCGTCCTGGGCCATTAAGTGGGAGTACGTCGTTCTCCTGTTCCTTC
ACCCCAGTTCGTAGCGCAGGACCCGGTAATTCACCCTCATGCAGCAAGAGGACAAGGAAG

LeuAlaAspAlaArgValCysSerCysLeuTrpMetMetLeuLeuIleSerGlnAlaGlu
301 TGCTTGCAGACGCGCGCGTCTGCTCCTGCTTGTGGATGATGCTACTCATATCCCAAGCGG
ACGAACGTCTGCGCGCGCAGACGAGGACGAACACCTACTACGATGAGTATAGGGTTCGCC

Overlap with 14i

AlaAlaLeuGluAsnLeuValIleLeuAsnAlaAlaSerLeuAlaGlyThrHisGlyLeu
361 AGGCGGCTTTGGAGAACCTCGTAATACTTAATGCAGCATCCCTGGCCGGGACGCACGGTC
TCCGCCGAAACCTCTTGGAGCATTATGAATTACGTCGTAGGGACCGGCCCTGCGTGCCAG

Val
421 TTGTATC
AACATAG
Translation of DNA

GlyCvsProGluArgLeuAlaSerCysArgProLeuthrAspPheAspGlnGiyTrpGly 
1 CAGGCTGTCCTGAGAGCCTAGCCAGCTGCCGACCCCTTACCGATTTTGACCAGGGCTGGG 
GTCCGACAGGACTCTCCGATCGG1CGACGGCTGGGGAATGGCTAAAACTGGTCCCGACCC 

proIleSerTyrAlaA3nGlyS«rGlyProAipGlnArgProtyrCy$TrpHisTyrPro 
6 1 GCCCTATCAGTTATGCCAACGGAAGCGGCCCCGACCAGCGCCCCTACTGCTGGCACTACC 
CGGGATAGTCAATACGGTTGCCTTCGCCGCGGCTGGTCGCGGGGATGACGACCGTGATGG 

ProLyiProCy«GlyIl«ValPraAlaLysSerValCy«GlyProVAlTyrCyaPh*Thr 
121 CCCCAAAACCTTGCGGTATTGTGCCCGCGAAGAGTGTGTGTGGTCCGGTATATTGCTTCA 
GGGGTTTTGGAACGCCATAACACGGGCGCTTCTCACACACACCAGGCCATATAACGAAGT 

ProSerProValValValGlyThxThxAspArgSerGlyAl»ProThrTyrSerTrpGly 
1B1 CTCCCAGCCCCGTGGTCGTGGGAACGACCGACAGGTCGGGCGCGCCCACCTACAGCTGGG 
GAGGGTCGGGGCACCACCACCCTTGCTGGCTGTCCAGCCCGCGCGGGTGGATGTCGACCC 

GluAfnA»pThrAipValPhcValL«uAitUisnThrArgProProLeuGlyAjnTrpPhc 
241 GTGAAAATGATACGGACGTCTTCGTCCTTAACAATACCAGGCCACCGCTGGGCAATTGGT 
CACTTTTACTATGCCTGCAGAAGCAGGAATTGTTATGGTCCGGTGGCGACCCGTTAACCA 

GlyCyiThrTrpHetAJtiSerThrGlyPheThrLysValCysGlyAlaProProCysVat 
3 01 TCGGTTGTACCTGGATGAACTCAACTGGATTCACCAAAGTGTGCGGAGCGCCTCCTTGTG 
AGCCAACATGGACCTACTTGAGTTGACCTAAGTGGrtTCACACGCCTCGCGGAGGAACAC 

lleGlyGlyAlaGlyAsnAanThrLtuHlsCytProThrAspCysPh^ArgLysHisPro 
361 TCATCGGAGGGGCGGGCAACAACACCCTGCACTGCCCCACTGATTGCTTCCGCAAGCATC 
AGTACCCTCCCCGCCCGTTGTTGTGGGACGTGACGGGGTGACTAACGAAGGCGTTCGTAG 

AapAlaThrTyrSerA^CysGlyS«rGlyProTrpIleThrPrQArgCysLeuValAsp 
421 CGGACGCCACATACTCTCGGTGCGGCTCCGGTCCCTGGATCACACCCAGGTGCCTGGTCG 
GCCTGCGGTGTATGAGAGCCACGCCGAGGCCAGGGACCTAGTG TGG GTCCACGG ACCAGC 

T7rProtytAxgUuTrpHijTyrP»CyiThrIl«AstiTyrThrIl«PhaLysIl«Arg 

481 ACTACCCGTATAGGCTTTGGCATTATCCTTGTACCATCAACTACACTATATTTAAAATCA 
. . TCATGGGCATATCCGAAACCGTAATAGGAACATGGTAGTTGAtGTGATATAAATTTTAGT 



M«tTyrValGlyGlyV«lGluHl«ArgL«uGluAlaAlaCyjAsnTrpThrArgGlyGlu 
541 GGATGTACGTGGGAGGGGTCGAGCACAGGCTGGAAGCTGCCTGCAACTGGACGCGGGGCG 
CC2ACATGCACCCTCCCCAGCTCGTGTCCGACCTTCGACCGACGTTGACCTGCGCCCCGC 



ArgCy f AjpUuGluAipAr gAspArgS arGluI^uS erProt^uL«uLeuThrThrThr 
601 AACGTTGCGATCTGGAAGATAGGGACAGGTCCGAGCTCAGCCCGTTACTGCTGACCACTA 
TTGCAACGCTAGACCTTCTATCCCTGTCCAGGCTCGAGTCGGGCAATGACGACTGGTGAX 



GlnTrpGlnValI^uPrt^yiSirPhiThrThrl^uProAlalAuS«rThrGlyLeuIle 

661 CACAGTGGC*GGTCCTCCCGTGTTCCTTCACAAa^ 

GTGTCACCGTCCAGGAGGGCACAAGGAAGTGTTGGGACGGTCGGAACAGGTGGCCGGAGT 

•Overlap vlth Combined ORF of DNAj ttoouch 15e~-~* 

HisLeuHisGlnAsnlleValAspValGlntyrl^uTyxGlyValGlySerSerll^a 

7 21 TCCACCTCCACCAGAACAT?CTGCA<*TGCAGTACTO^ 

AGGTGGAGGTGGTCTTGTAACACCTGCACGTCATGAACATGCCCCACCCCAGTTCGTAGC 



SarTrpAlalleLyiTrpGluTyrValYalX^uI^uPhelAuI^i^uAlaAapAia^g 
7 6 1 CGTCCTGGGCCATTAAGTGGGAGTACGTCGTCCTCCTGTTCCTtCTGCTTGWG^CCGC 
CCAGGACCCGGTAATTCACCCTCAtGCAGCAGGAGGACAAGGAAGACGAACGTCTGCGCG 
ValCysSerCysLeuTrpMetMetLeuLeuIleSerGlnAlaGluAlaAlaLeuGluAsn
841 GCGTCTGCTCCTGCTTGTGGATGATGCTACTCATATCCCAAGCGGAAGCGGCTTTGGAGA
CGCAGACGAGGACGAACACCTACTACGATGAGTATAGGGTTCGCCTTCGCCGAAACCTCT

UuValllelAUAsnAlAAlaStrUuAlaGlyTta^ 
901 ACCTCGTAATACTTAATGCAGCATCCCTGGCCCGGACGCACGGTCTTGtATCCTTCCTCG 
TGGAGCAtTATGAATTACGTCGTAGGGACCCkKCCTGCGTGCCWAACATAC(^AGGAGC 

~PhePheCyjPh^aTrpTyrL«uLysClyLyrt^ 
961 TGTTCTTCTGCTTTGCATGGTATCTGAAGCGiAAGTGGGTGCCCGGAGCGGTCTA&CCT 
ACMGAAGACGAAACGTACCATAGACTTCCCATTCMCCACGGGCCXCGCCAWJCTQGA 

TyrGlyMetTz^ProLauLcuI^ur^uLevil^uAl*l*uProGlnAr^AlATyrAl«l^u 
io 2 1 tctaccggatgtcgcctctcctcctgctcctctt^^ 

AGAtGCCCTACACCGGAGAGGAGGACGAGCACAACCGCAACGGGGTCGCCCGCATGCGCG 

A5pThjGluValAlaAlaSerCysGlyGlyValValL«uValGlyLeuKetAlaL«uThr 

* 0 8 1 TGG ACACGGAGGTGGCCGCGTCGTGTGGCGGTGTTGTTCTCGTCGGGTTG ATGGCGCTAA 

ACCTGTGCCTCCACCGGCGCAGCACACCGCCACAACAAGAGCAGCCCAACTACCGCGATT 

LauS«rProTyrTyrLyjArgTyrIltStrTrpCy*LauTrpTrpLtuGlnTyxPh*Lea 

* 14 1 CTCTGTCACCATATTACAAGCG CTATATCAGCTGGTGCTTGTGGTGG CTTCAGTATTTTC 

GAGACAGTGGTAXAATGTTCGCGATATAGTCGACCACGAACACCACCGAAG7CATAAAAG 

*"ThrArgValGluAlaGlnLeuHisValTrpIl€ProProl^uA5nValArgGlyGlyArg 
1201 TGACCAGAGTGGAAGCGCAACTGCACGTGTGGATTCCCCCCCTCAACGTCCGAGGGGGGC 
ACTGGTCTCACCrrCCCGTTGACGTGCACACCTAAGGGGCGCAGTTGCAGGCTCCCCCCG 

~~spAlaValIleLeuL«uMet^ 

1261 " gcgacgctgtcatcttactcatctgtgcrctacaccccactctggtatttgacatcacca 
cgctgcgaca^agS 

"Ze^euLeuAlaValPheGlyProL;^ 
1321 AATTGCTGCTGCCCGTCTTCGGACCCCTMWAT^^ 

TTAAC G ACGACCGGCAG AAG CCTGGGG AAACCTAAG AAGTTCGGTC 
Translation of DNA 15e



GlyAlaGlyLysArgValTyrTyrLeuThrArgAspProThrThrProLeuAlaArgAla
1 CGGCGCTGGAAAGAGGGTCTACTACCTCACCCGTGACCCTACAACCCCCCTCGCGAGAGC
GCCGCGACCTTTCTCCCAGATGATGGAGTGGGCACTGGGATGTTGGGGGGAGCGCTCTCG

AlaTrpGluThrAlaArgHisThrProValAsnSerTrpLeuGlyAsnIleIleMetPhe
61 TGCGTGGGAGACAGCAAGACACACTCCAGTCAATTCCTGGCTAGGCAACATAATCATGTT
ACGCACCCTCTGTCGTTCTGTGTGAGGTCAGTTAAGGACCGATCCGTTGTATTAGTACAA

AlaProThrLeuTrpAlaArgMetIleLeuMetThrHisPhePheSerValLeuIleAla
121 TGCCCCCACACTGTGGGCGAGGATGATACTCATGACCCATTTCTTTAGCGTCCTTATAGC
ACGGGGGTGTGACACCCGCTCCTACTATGAGTACTGGGTAAAGAAATCGCAGGAATATCG

ArgAspGlnLeuGluGlnAlaLeuAspCysGluIleTyrGlyAlaCysTyrSerIleGlu
181 CAGGGACCAGCTTGAACAGGCCCTCGATTGCGAGATCTACGGGGCCTGCTACTCCATAGA
GTCCCTGGTCGAACTTGTCCGGGAGCTAACGCTCTAGATGCCCCGGACGATGAGGTATCT

ProLeuAspLeuProProIleIleGlnArgLeu
241 ACCACTTGATCTACCTCCAATCATTCAAAGACTC
TGGTGAACTAGATGGAGGTTAGTAAGTTTCTGAG
Translation of DNA 13i

ProSerProValValValGlyThrThrAspArgSerGlyAlaProThrTyrSerTrpGly
1 CTCCCAGCCCCGTGGTGGTGGGAACGACCGACAGGTCGGGCGCGCCTACCTACAGCTGGG
GAGGGTCGGGGCACCACCACCCTTGCTGGCTGTCCAGCCCGCGCGGATGGATGTCGACCC

GluAsnAspThrAspValPheValLeuAsnAsnThrArgProProLeuGlyAsnTrpPhe
61 GTGAAAATGATACGGACGTCTTCGTCCTTAACAATACCAGGCCACCGCTGGGCAATTGGT
CACTTTTACTATGCCTGCAGAAGCAGGAATTGTTATGGTCCGGTGGCGACCCGTTAACCA

GlyCysThrTrpMetAsnSerThrGlyPheThrLysValCysGlyAlaProProCysVal
121 TCGGTTGTACCTGGATGAACTCAACTGGATTCACCAAAGTGTGCGGAGCGCCTCCTTGTG
AGCCAACATGGACCTACTTGAGTTGACCTAAGTGGTTTCACACGCCTCGCGGAGGAACAC

IleGlyGlyAlaGlyAsnAsnThrLeuHisCysProThrAspCysPheArgLysHisPro
181 TCATCGGAGGGGCGGGCAACAACACCCTGCACTGCCCCACTGATTGCTTCCGCAAGCATC
AGTAGCCTCCCCGCCCGTTGTTGTGGGACGTGACGGGGTGACTAACGAAGGCGTTCGTAG

AspAlaThrTyrSerArgCysGlySerGlyProTrpLeuThrProArgCysLeuValAsp
241 CGGACGCCACATACTCTCGGTGCGGCTCCGGTCCCTGGCTCACACCCAGGTGCCTGGTCG
GCCTGCGGTGTATGAGAGCCACGCCGAGGCCAGGGACCGAGTGTGGGTCCACGGACCAGC

TyrProTyrArgLeuTrpHisTyrProCysThrIleAsnTyrThrIlePheLysIleArg
301 ACTACCCGTATAGGCTTTGGCATTATCCTTGTACCATCAACTACACCATATTTAAAATCA
TGATGGGCATATCCGAAACCGTAATAGGAACATGGTAGTTGATGTGGTATAAATTTTAGT

MetTyrValGlyGlyValGluHisArgLeuGluAlaAlaCysAsnTrpThrArgGlyGlu
361 GGATGTACGTGGGAGGGGTCGAGCACAGGCTGGAAGCTGCCTGCAACTGGACGCGGGGCG
CCTACATGCACCCTCCCCAGCTCGTGTCCGACCTTCGACGGACGTTGACCTGCGCCCCGC

Overlap with 12f

ArgCysAspLeuGluAspArgAspArgSerGluLeuSerProLeuLeuLeuThrThrThr
421 AACGTTGCGATCTGGAAGACAGGGACAGGTCCGAGCTCAGCCCGTTACTGCTGACCACTA
TTGCAACGCTAGACCTTCTGTCCCTGTCCAGGCTCGAGTCGGGCAATGACGACTGGTGAT

GlnTrpGlnValLeuProCysSerPheThrThrLeuProAlaLeuSerThrGlyLeu
481 CACAGTGGCAGGTCCTCCCGTGTTCCTTCACAACCCTGCCAGCCTTGTCCACCGGCCTCA
GTGTCACCGTCCAGGAGGGCACAAGGAAGTGTTGGGACGGTCGGAACAGGTGGCCGGAGT



FIGURE 4
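The "Overlap with ..." annotations in these figures mark regions where the sequence of one clone coincides with that of a neighboring clone. As a hedged illustration only (the specification does not describe a computational procedure), the following Python sketch finds the longest suffix of one fragment that is also a prefix of the next, then joins the two. The function names `overlap_length` and `merge`, the `min_len` threshold, and the two short fragments are assumptions chosen for illustration; the fragments are not verbatim figure rows.

```python
def overlap_length(left: str, right: str, min_len: int = 8) -> int:
    """Return the length of the longest suffix of `left` that equals a
    prefix of `right`, or 0 if no overlap of at least `min_len` exists."""
    for k in range(min(len(left), len(right)), min_len - 1, -1):
        if left.endswith(right[:k]):
            return k
    return 0


def merge(left: str, right: str, min_len: int = 8) -> str:
    """Join two fragments, writing the shared overlapping region once."""
    k = overlap_length(left, right, min_len)
    if k == 0:
        raise ValueError("no overlap of the required length")
    return left + right[k:]


# Illustrative fragments (assumed for this example only).
FRAGMENT_A = "GTATATTGCTTCACTCCCAGCCCCGTGGTGGTGGG"
FRAGMENT_B = "CTCCCAGCCCCGTGGTGGTGGGAACGACCGAC"

if __name__ == "__main__":
    print(overlap_length(FRAGMENT_A, FRAGMENT_B))
    print(merge(FRAGMENT_A, FRAGMENT_B))
```

Exact matching is used for simplicity; real clone sequences may differ at single positions (the figures mark nucleotide heterogeneity), in which case a mismatch-tolerant comparison would be needed.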






EP 0 388 232 A1 



Translation of DNA 26j

LeuPheTyrHisHisLysPheAsnSerSerGlyCysProGluArgLeuAlaSerCysArg
1 GCTTTTCTATCACCACAAGTTCAACTCTTCAGGCTGTCCTGAGAGGCTAGCCAGCTGCCG
CGAAAAGATAGTGGTGTTCAAGTTGAGAAGTCCGACAGGACTCTCCGATCGGTCGACGGC

ProLeuThrAspPheAspGlnGlyTrpGlyProIleSerTyrAlaAsnGlySerGlyPro
61 ACCCCTTACCGATTTTGACCAGGGCTGGGGCCCTATCAGTTATGCCAACGGAAGCGGCCC
TGGGGAATGGCTAAAACTGGTCCCGACCCCGGGATAGTCAATACGGTTGCCTTCGCCGGG

AspGlnArgProTyrCysTrpHisTyrProProLysProCysGlyIleValProAlaLys
121 CGACCAGCGCCCCTACTGCTGGCACTACCCCCCAAAACCTTGCGGTATTGTGCCCGCGAA
GCTGGTCGCGGGGATGACGACCGTGATGGGGGGTTTTGGAACGCCATAACACGGGCGCTT

Overlap with 13i

SerValCysGlyProValTyrCysPheThrProSerProValValVal
181 GAGTGTGTGTGGTCCGGTATATTGCTTCACTCCCAGCCCCGTGGTGGTGGG
CTCACACACACCAGGCCATATAACGAAGTGAGGGTCGGGGCACCACCACCC
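The three-line layout used in these figures (amino acid labels, coding strand, complementary strand) can be reproduced programmatically. The following is a minimal Python sketch and is not part of the original specification. It assumes the standard genetic code, and the names `complement`, `translate`, and `ROW_181` are illustrative. It is applied to the final row of the clone 26j figure, whose reading frame begins at the second nucleotide of the row.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
# at each of the three positions (third position varies fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# One-letter to three-letter amino acid codes; "End" marks a stop codon
# (an assumption; the figures themselves use labels such as AM and OP).
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "End",
}


def complement(seq: str) -> str:
    """Base-by-base complement, written in the same left-to-right order."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))


def translate(seq: str, frame: int = 0) -> str:
    """Translate complete codons of `seq`, starting at offset `frame`."""
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(THREE_LETTER[CODON_TABLE[c]] for c in codons)


# Final row (nucleotide 181 onward) of the clone 26j figure.
ROW_181 = "GAGTGTGTGTGGTCCGGTATATTGCTTCACTCCCAGCCCCGTGGTGGTGGG"

if __name__ == "__main__":
    print(translate(ROW_181, frame=1))  # amino acid label line
    print(ROW_181)                      # coding strand
    print(complement(ROW_181))          # complementary strand
```

Because every row carries the same information three times (labels, coding strand, complement), comparing the output of such a sketch with a printed row is one way to spot transcription errors in a single line.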
Translation of DNA CA59a

LeuValMetAlaGlnLeuLeuArgIleProGlnAlaIleLeuAspMetIleAlaGlyAla
1 TTGGTAATGGCTCAGCTGCTCCGGATCCCACAAGCCATCTTGGACATGATCGCTGGTGCT
AACCATTACCGAGTCGACGAGGCCTAGGGTGTTCGGTAGAACCTGTACTAGCGACCACGA

HisTrpGlyValLeuAlaGlyIleAlaTyrPheSerMetValGlyAsnTrpAlaLysVal
61 CACTGGGGAGTCCTGGCGGGCATAGCGTATTTCTCCATGGTGGGGAACTGGGCGAAGGTC
GTGACCCCTCAGGACCGCCCGTATCGCATAAAGAGGTACCACCCCTTGACCCGCTTCCAG

LeuValValLeuLeuLeuPheAlaGlyValAspAlaGluThrHisValThrGlyGlySer
121 CTGGTAGTGCTGCTGCTATTTGCCGGCGTCGACGCGGAAACCCACGTCACCGGGGGAAGT
GACCATCACGACGACGATAAACGGCCGCAGCTGCGCCTTTGGGTGCAGTGGCCCCCTTCA

AlaGlyHisThrValSerGlyPheValSerLeuLeuAlaProGlyAlaLysGlnAsnVal
181 GCCGGCCACACTGTGTCTGGATTTGTTAGCCTCCTCGCACCAGGCGCCAAGCAGAACGTC
CGGCCGGTGTGACACAGACCTAAACAATCGGAGGAGCGTGGTCCGCGGTTCGTCTTGCAG

GlnLeuIleAsnThrAsnGlySerTrpHisLeuAsnSerThrAlaLeuAsnCysAsnAsp
241 CAGCTGATCAACACCAACGGCAGTTGGCACCTCAATAGCACGGCCCTGAACTGCAATGAT
GTCGACTAGTTGTGGTTGCCGTCAACCGTGGAGTTATCGTGCCGGGACTTGACGTTACTA

SerLeuAsnThrGlyTrpLeuAlaGlyLeuPheTyrHisHisLysPheAsnSerSerGly
301 AGCCTCAACACCGGCTGGTTGGCAGGGCTTTTCTATCACCACAAGTTCAACTCTTCAGGC
TCGGAGTTGTGGCCGACCAACCGTCCCGAAAAGATAGTGGTGTTCAAGTTGAGAAGTCCG

Overlap with 26j

Overlap with K9-1

CysProGluArgLeuAlaSerCysArgPro
361 TGTCCTGAGAGGCTAGCCAGCTGCCGACCCC
ACAGGACTCTCCGATCGGTCGACGGCTGGGG
Translation of DNA CA84a



GlnGlyCysAsnCysSerIleTyrProGlyHisIleThrGlyHisArgMetAlaTrpAsp
1 CGCAAGGTTGCAATTGCTCTATCTATCCCGGCCATATAACGGGTCACCGCATGGCATGGG
GCGTTCCAACGTTAACGAGATAGATAGGGCCGGTATATTGCCCAGTGGCGTACCGTACCC

MetMetMetAsnTrpSerProThrThrAlaLeuValMetAlaGlnLeuLeuArgIlePro
61 ATATGATGATGAACTGGTCCCCTACGACGGCGTTGGTAATGGCTCAGCTGCTCCGGATCC
TATACTACTACTTGACCAGGGGATGCTGCCGCAACCATTACCGAGTCGACGAGGCCTAGG

GlnAlaIleLeuAspMetIleAlaGlyAlaHisTrpGlyValLeuAlaGlyIleAlaTyr
121 CACAAGCCATCTTGGACATGATCGCTGGTGCTCACTGGGGAGTCCTGGCGGGCATAGCGT
GTGTTCGGTAGAACCTGTACTAGCGACCACGAGTGACCCCTCAGGACCGCCCGTATCGCA

Overlap with CA59a

PheSerMetValGlyAsnTrpAlaLysValLeuValValLeuLeuLeuPheAlaGlyVal
181 ATTTCTCCATGGTGGGGAACTGGGCGAAGGTCCTGGTAGTGCTGCTGCTATTTGCCGGCG
TAAAGAGGTACCACCCCTTGACCCGCTTCCAGGACCATCACGACGACGATAAACGGCCGC

AspAlaGluThrHisValThrGly
241 TCGACGCGGAAACCCACGTCACCGGGG
AGCTGCGCCTTTGGGTGCAGTGGCCCC
Translation of DNA CA156e

CysTrpValAlaMetThrProThrValAlaThrArgAspGlyLysLeuProAlaThrGln
1 GTGTTGGGTGGCGATGACCCCTACGGTGGCCACCAGGGATGGCAAACTCCCCGCGACGCA
CACAACCCACCGCTACTGGGGATGCCACCGGTGGTCCCTACCGTTTGAGGGGCGCTGCGT

LeuArgArgHisIleAspLeuLeuValGlySerAlaThrLeuCysSerAlaLeuTyrVal
61 GCTTCGACGTCACATCGATCTGCTTGTCGGGAGCGCCACCCTCTGTTCGGCCCTCTACGT
CGAAGCTGCAGTGTAGCTAGACGAACAGCCCTCGCGGTGGGAGACAAGCCGGGAGATGCA

GlyAspLeuCysGlySerValPheLeuValGlyGlnLeuPheThrPheSerProArgArg
121 GGGGGACCTATGCGGGTCTGTCTTTCTTGTCGGCCAACTGTTCACCTTCTCTCCCAGGCG
CCCCCTGGATACGCCCAGACAGAAAGAACAGCCGGTTGACAAGTGGAAGAGAGGGTCCGC

HisTrpThrThrGlnGlyCysAsnCysSerIleTyrProGlyHisIleThrGlyHisArg
181 CCACTGGACGACGCAAGGTTGCAATTGCTCTATCTATCCCGGCCATATAACGGGTCACCG
GGTGACCTGCTGCGTTCCAACGTTAACGAGATAGATAGGGCCGGTATATTGCCCAGTGGC

Overlap with CA84a

MetAlaTrpAspMetMetMetAsnTrpSerProThrThrAlaLeuValValAlaGlnLeu
241 CATGGCATGGGATATGATGATGAACTGGTCCCCTACGACGGCGTTGGTAGTGGCTCAGCT
GTACCGTACCCTATACTACTACTTGACCAGGGGATGCTGCCGCAACCATCACCGAGTCGA

LeuArgIleProGlnAla
301 GCTCCGGATCCCACAAGCC
CGAGGCCTAGGGTGTTCGG
Translation of DNA CA167b

SerThrGlyLeuTyrHisValThrAsnAspCysProAsnSerSerIleValTyrGluAla
1 CTCCACGGGGCTTTACCACGTCACCAATGATTGCCCTAACTCGAGTATTGTGTACGAGGC
GAGGTGCCCCGAAATGGTGCAGTGGTTACTAACGGGATTGAGCTCATAACACATGCTCCG

AlaAspAlaIleLeuHisThrProGlyCysValProCysValArgGluGlyAsnAlaSer
61 GGCCGATGCCATCCTGCACACTCCGGGGTGCGTCCCTTGCGTTCGTGAGGGCAACGCCTC
CCGGCTACGGTAGGACGTGTGAGGCCCCACGCAGGGAACGCAAGCACTCCCGTTGCGGAG

ArgCysTrpValAlaMetThrProThrValAlaThrArgAspGlyLysLeuProAlaThr
121 GAGGTGTTGGGTGGCGATGACCCCTACGGTGGCCACCAGGGATGGCAAACTCCCCGCGAC
CTCCACAACCCACCGCTACTGGGGATGCCACCGGTGGTCCCTACCGTTTGAGGGGCGCTG

Overlap with CA156e

GlnLeuArgArgHisIleAspLeuLeuValGlySerAlaThrLeuCysSerAlaLeuTyr
181 GCAGCTTCGACGTCACATCGATCTGCTTGTCGGGAGCGCTACCCTCTGTTCGGCCCTCTA
CGTCGAAGCTGCAGTGTAGCTAGACGAACAGCCCTCGCGATGGGAGACAAGCCGGGAGAT

ValGlyAspLeuCysGlySerValPheLeu
241 CGTGGGGGACTTGTGCGGGTCTGTCTTTCTTG
GCACCCCCTGAACACGCCCAGACAGAAAGAAC



FIGURE 9 
Translation of DNA CA216a

Ar?Ar»W«rAr9AraI*tflYtyiValIlU^^ 
1 CCC«CGTA£STCSCG<aATXT0COT 

LeuH«tGiyTyrxle?roUuv»iGlyUiFr^^ 

6 1 ACCTCATOGCGt XCXXAOCfiCTOCTCCaCCCCCCt CrTQGAMCGCTOCCAMOCCCTGG 
T0<ttGTACCCCATCtAT3QCSA0CAa^ 

Hi*<3 IrVllAlVVaaiJiuaiuA^pOlTValJUaTYTAlt XfcxGlyA*aL«uFrcciycy a 

CGCATGG<*TCCGSGTTCTGGAIGACGra 
GCCTAC Cfl OflC Oeg AJOCCtTegCCeCCACCTQAtA 



181 GC1CTTTCTCTATCOTCC5 rCTCC^CTGCtCTCTtKTTSACTGTGCCCGCTTOfiflCCT 

GlAV«lAr9A«3«ttraiyl^TrrHi«YalTte^ 
241 ACCAAGTGCGCAACTCCACGGGGCTTTACtt 

TOGTTC ACtJCflTTOAGtf TGCCCCCAAA10G TGCAG TGGTTACTAACOOGATTGAGCTCAt 



Overlap with CA167b



Yuty«iujamsj^pJum«UuKiiitePr^^ 

301 TtGTCTAOSAJtfCXCCSATCCCArCCTOCAa^ 
AACACAttX^TCGCOKXZrAC^ 

GlyA«Jd*S«r*xgC7itrpV4UlAJUtTKrProThrV«lAla 
361 *»OCAACOCCTC3AQGT3rr^ 



FIGURE 10 
Translation of DNA CA290a
1 jaaJUaAAAACA^ 

GlMimiGiyclyrUTyrLwiauFrw 

CAGTCTAXAACCACCTOUUttGAAOlACGW^ 

181 «CfilcS2cCC§fl^^ 

<il£^CCS^TeCCClCCTSC*CCCCAC 

eXuCl?eytaiyftBXliCl?ftrp t«u r^u S « r gte>jgCly&*rAr?ProS»gTrpGly 



~A^i*LeuJ^ 



361 



431 



w^<*ccmcgacuuu^ 



FIGURE 11 
Translation of DNA ag30a



f Me tSerVal V* lCln ProProCXy P ro Pr oL*u 

#KfitAlaLeuValOP 
l errata nfi CCTCTACccxTccccTtaGTAra 

OCGTCTTTCtW^raOTACCOCAATW 
proGly^luProA* 

61 TCCCGCGAGAGO^TAGTCETWC^ 

ACGGCCCTCTCCGTArc^CCAGACGCeTTGGCCACra 



♦^etProGlyAspLeuGlyvtlPiwroGlaAap 

121 CCWTCCTTTCTTGGATCAACECGCIWT^^ 
GCCCAGGAAAGAACCTAGTlGGGCGACT^^ 

OP AX CljAlmCrs 

cyrAH * 

191 CTCCTAGCCGAGTAGTGTTGGGTCGCGAMGGCCTTGTGGTACTGCCTC^TAOOGTGCTT 
GACGATCGGCrakTCACAACCCACCGCTTTCCG^AACACC^ 

I 

G I uCy« ? roG 1 y Ar^S e tAr gAr^P roCy* Thrtte tScrThiAanf rcLy a ProGl nLy * 
24 1 GCflAGTGCCCCGGGAGGTCTCGTAGACCGTt^ACCATGAGCACGAATCCTAAACCTCAAA 

cgctcacggggccctccagagcatctgcc^cctggtactcgtgcttaggattigcacttt 

L/aAflnLyaAr^A3nThrA3nArgAr$froGIoAapV*lLy'a?h«ProQlyGl7GlyGln 



301 AAAAAAAC AAACGTAACACCAACCGTCGCCCACAGG ACG TCAAGTTCCCGGGTGGCGG TC 
TTTTT?TCTTTCCATTCTCG?7CGC^^ 

1 1 •v«l<5 lyQ lyV«lXyrlj»uL*u*raAr9 Ar gOl y P rcA rg ImuO iyV* 1 Ar gAl alhr 



361 AGATCGTTGGTCGAGTTTACTTGTTTCCGCGCAGGGGCCC7AGATTCGGTGTGCGCGCGA 
TCTACCAACCACCTCAAATCAACAACW^^ 

Ar^LyaThraerGluAr^flerGlnPraAr^lyAMArgOlnProIleProLyaAlaAr? 



4 21 CGAGAAAGACTTCCGAGCGGTCGCAACCICGAOGTAGACGTCAGCCTATCCCCAAGGCTC 
CCTCTTTC7CAAGGC?CGCCACC£TXCCAGra 

ArgfroGl^lyAxgThrtriJaaGlnProGly^^ 



Overlap with CA290a



4 8 1 GTCGGCCCG AGGGCAGGACCTCGGCTCAGCCCGGCTACCCTTGGCCC^ 

CXSCCGGCCTCCCCTCCTCC ACCCC AOTCGGCCCCATGGGAACCCCCCACATACCCTPAC 



FIGURE 12-1
GlyCysGlyTrpAlaGlyTrpLeuLeuSerProArgGlySerArgProSerTrpGlyPro



*W5CW»»TOSSCGGCATGGCTCCTGTCTCCCCGTCGC^ 
^i^l^roA^AxgtegteAr^Ajml*^ sGly 



CCACAGACCCCCGGCGTACGTCGCGCAATTTGC^T^ 

GCTGTCTCWGGCCCGCXTCCAGCGCGTTAAACCCATTCCAGTAGCTATGGGAATGCACGC 
Pha 



GCTTC 
CGAAG 

» - Start of long HCV ORF 

I - Putative first amino acid of large HCV polyprotein
I - Putative small encoded peptides (that may play a
translational regulatory role)
Translation of DNA CA205a

121 SSgctkmSccc^^ 



181 S^n^SoSc^^ 



• . putative initiator methionine codon



FIGURE 13 
Translation of DNA 18g



IPreProOP 

fSerThxMetAsrmsSerProV'alArgAsnTyrCysI^uHifiAlaCluSerYaUM 
»I*ttttl«glaGlugcrteuProCy»QluGlulguL«u^^ 

CJtf*nCCTJ*::?2AC:rCAGGGGACA^ 



|M«tS«xValV4lCLnPrcPrcCl7PrfiProLauProGl7CluProAH 
MetAlaLauValCP 

6 1 ATGGCGTTAGTATG AGTGTCGTG CAGCCTC CAGGACCCCCCC TCCCGGGAG AG CCATAGT 
TACCGCAATCATACTCACAt^CGICGGAGffTCCTGSGC^^ 



121 GGTCTGCGGAACCGGTGAGTACACCGGAAT1GCCAGGACGACCC^GTCCTTTCTTC5GATC 
CCAGACGCCTTGGCCACTCAXGXGGCCTTAACGGTCCXGCT3GCCCAGGAAAGAACCTAG 



Overlap with ag30a



ItetP roGl yA*pL* uGlyVa lProProG LnAs pCysAM 

IS 1 AACCC5CTCAATGCC73G AGATTTGGGCGTGCCGCCSCAAG ACTCCTAGCCGAGTAGTGT 
TTCCflCCACXTACeCACCTCTAAACCCGe*^ 



OP W4 GlyAl*Cy*GluCyoProGlyArgSAr 

241 TGGGTCGCGAAAGGCCTXGTGGTACTGCCTGATAGGGTGCTTGCGAGTGCCCCGGGAGGT 
ACCCAGCGCTTTCCG6AACACCATGACGGACTATCCCACGAACGCTfl\C6GGGCCCTCCA 



ArgArg 



201 CICGTAGA 
CACttTCT 



* - Start of long HCV ORF

* - Putative small encoded peptides (that may play a translational regulatory role)



FIGURE 14
Translation of DNA 16jh

Overlap with 15e

GlyAlaCysTyrSerIleGluProLeuAspLeuProProIleIleGlnArgLeuHisGly
1 GGGGCCTGCTACTCCATAGAACCACTGGATCTACCTCCAATCATTCAAAGACTCCATGGC
CCCCGGACGATGAGGTATCTTGGTGACCTAGATGGAGGTTAGTAAGTTTCTGAGGTACCG

LeuSerAlaPheSerLeuHisSerTyrSerProGlyGluIleAsnArgValAlaAlaCys
61 CTCAGCGCATTTTCACTCCACAGTTACTCTCCAGGTGAAATTAATAGGGTGGCCGCATGC
GAGTCGCGTAAAAGTGAGGTGTCAATGAGAGGTCCACTTTAATTATCCCACCGGCGTACG

Gly*
G

LeuArgLysLeuGlyValProProLeuArgAlaTrpArgHisArgAlaArgSerValArg
121 CTCAGAAAACTTGGGGTACCGCCCTTGCGAGCTTGGAGACACCGGGCCCGGAGCGTCCGC
GAGTCTTTTGAACCCCATGGCGGGAACGCTCGAACCTCTGTGGCCCGGGCCTCGCAGGCG

AlaArgLeuLeuAlaArgGlyGlyArgAlaAlaIleCysGlyLysTyrLeuPheAsnTrp
181 GCTAGGCTTCTGGCCAGAGGAGGCAGGGCTGCCATATGTGGCAAGTACCTCTTCAACTGG
CGATCCGAAGACCGGTCTCCTCCGTCCCGACGGTATACACCGTTCATGGAGAAGTTGACC

AlaValArgThrLysLeuLys
241 GCAGTAAGAACAAAGCTCAAAC
CGTCATTCTTGTTTCGAGTTTG



* - nucleotide heterogeneity 



FIGURE 15 
COMBINED ORF OF DNAs pi14a THROUGH 15e

(pi14a/CA167b/CA156e/CA84a/CA59a/K9-1/12f/14i/11b/7f/7e/
8h/33c/40b/37b/35/36/81/32/33b/25c/14c/8f/33f/33g/39c/
35f/19g/26g & 15e)

ArgSerArgAsnLeuGlyLysValllftAspThrLeuThrCysGlyPheAlaAapLeuMet 
1 AGGTCGCGCAATTTGGGTAAGGTCATCGATACCCTTACGTGCGGCTTCGCCGACCTCAW 
TCCAGCGCGTTAAACCCATTCCAGTAGCTATGGGAATGCAGGCCGAAGCGGCTCGAGTAC 

GlyTyrlleProteuvalGlyAlaProLauGlyGlyAlaAlaArgAlalAuAlaHisGly 
61 GGGTACATACCGCTCGTCGGCGCCCCTCTTGGAGGCGCTGCCAGGGCCCTGGCGCATGGC 
CCCATGTATGGCGAGCAGCCGCGGGGAGAACCtCCGCGACGGTCCCGGGACCGCGTACCG 

valAr g valLauGl uAj pGly Val AsnTy r Al aThxCl yAs aLmu? roGlyCy s S erPhe 
121 GTCCGGGTTCTGGAAGACGGCGTGAACTATGCAACAGGGAACCTTCCTGGTTGCTCTTTC 
CAGGCCCAAGACCTTCTGCCGCACTTGATACGTTGTCCCTTGGAAGGACCAACGAGAAAG 

SerllePheLeiiLeuAlaLeuLeuSerCysLeuThrvalProAlaSerAlaTyrGlnVal 
131 TCTATCTTCCTTCTGGCCCTGCTCTCTTGCTTGACTGTGCCCGCTTCGGCCTACCAAGTG 
AGATAGAAGGAAGACCGGGACGAGAGAACGAACTGACACGGGCGAAGCCGGATGGTTCAC 

ArgAsnSerThrGlyLeuTyrHiaValThrAsiiAspCysProAjnSerSerlleValTyr 
241 CGCAACTCCACGGGGCTTTACCACGTCACCAATGATTGCCCTAACTCGAGTATTGTGTAC 
GCGT7GAGGTGCCCCGAAATGGTGCAGTGGTTACTAACGGGATTGAGCTCATAACACATG 

GluAlaAlaAapAlalleLeuHisThrProGlyCysValProCyaValArgGluGlyAsn 
301 GAGGCGGCCGATGCCATCCTGCACACTCCGGGGTGCGTCCCTTGCGTTCGTGAGGGCAAC 
CTCCGCCGGCTACGGTAGGACGTGTGAGGCCCCACGCAGGGAACGCAAGCACTCCCGTTG 

AlaSerArgCyiTrpVal^laMatThrProThrValAlaXhrArgAjpGlylyfLauPro 
361 GCCTCGAGGTGTTGGGTGGCGATGACCCCTACGGTGGCCACCAGGGATGGCAAACTCCCC 
CGGAGCTCCACAACCCACCGCTACTGGGGATGCCACCGGTGGTCCCTACCGTTTGAGGGG 

AlaThrGlnl^uArgArgHtsIleAapLeuLeuValGlyScrAlaThrLeuCysSarAla 
4 21 GCGACGCAGCTTCGACGTCACATCGATCTGCTTGTCGGGAGCGCCACCCTCTGTTCGGCC 
CGCTGCGTCGAAGCTGCAG1GTAGCTAGACGAACAGCCCTCGCGGTGGGAGACAAGCCGG 

LeuTyrValGlyA5pLeuCyaGlySerValPheLcuValGlyGlnL«uPheTnrPheSer 
481 CTCTACGTGGGGGACCTATGCGGGTCTCTCmCTTGTCGGCCAACTGTTCACCTTCTCT 
GAGATGCACCCCCTGGATACGCCCAGACAGAAAGAACAGCCGGTTGACAAGTGGAAGAGA 

ProArgArgHi3trpThrThrGlnGlyCysAsnCyaS«rIleTyrPrcGlyHisIleThr 
541 CCCAGGCGCCACTGGACGACGCAAGGTTGCAATTGCTCTAKTATCCCGGCCATATAACG 
GGGTCCGCGGTGACCTGCTGCGTTCCAACGTTAACGAGATAGATAGGGCCGGTATATTGC 

GlyHlsArgMet^aTrpA3pM«t«etMetAsnTrpScrProThrT!irAlaLeuVaLMet 
601 GGTCACCGCATGGCATGGGATATGATGATGAACTGGTCCCCTACGACGGCGTTGGTAATG 
CCAGTGGCGTACCGTACCCTATACTACTACTTGACCAGGGGATGCTGCCGCAACCATTAG 

AlaGlnI*uI*uAi^IleProGlnAlallel*u^ 
661 GCTCAGCTGCTCCGGATCCCACAAGCCAtCTTGGACATGATCGCTGGTGCTCACTGGGGA 
CGAGTCGACGAGGCCTAGGGTGTTCGGTAGAACCTGTACTAGCGACCAGGAGTGACCCCT 

valLeuAlaGlyIleAlaTyrPheSerMetvalGlyAsnTrpAlaLysValL«uValVal 
721 GTCCTGGCGGGCATAGCGTATTTCTCCATGGTGGGGAACTGGGCGAAGGTCCTGGTAGTG 
CAGGACCGCCCGTATCGCATAAAGAGGTACCACCCCTTGACCCGCTTCCAGGACCATCAC 

Letil^ul^uPheAlaGlyValAapMaGluThrHisValThrGlyGlySerAlaGlyHts 
781 CTGCTGCTATTTGCCGGCGTCGACGCGGAAACCCACGTCACCGGGGGAAGTGCCGGCCAC 
GACGACGATAAACGGCCGCAGCTGCGCCTTTGGGTGCAGTGGCCCCCTTGACGGCCGGTG 
ThrValSerGlyPheValSerLeuLeuAlaProGlyAlaLysGlnAsnValGlnLeuIle
841 ACTGTGTCTGGATTTGTTAGCCTCCTCGCACCAGGCGCCAAGCAGAACGTCCAGCTGATC
TGACACAGACCTAAACAATCGGAGGAGCGTGGTCCGCGGTTCGTCTTGCAGGTCGACTAG

AjnThxAanGlyS«rTrpHi3t«uA5nSerThrAlaL«uA*nCy3AJnAspS€rLeuAsn 
901 AACACCAACGGCAGXXGGCACCXCAAXAGCACGGCCCTGAACXGCAATGAXAGCCXCAAC 
TTGrGGTTGCCGTCAACCGTGGAGTTATCGTGCCGGGACTTGACG TTACTATC GGAGPTG 

ThrGlyTrpLeuAlaGlyI^uPh«TyrHisHljtytPhcAsaS«rSerGlyCysPrcCiu 
961 ACCGGCTGGriGGGAGGGCXXTTCTAXCACCACAAGXXCAACXCXTCAGGCXGXCCTGAG 
TGGCCGACCAACCGXCCCGAAAAGATAGXGGTC 

Ar9l^uAlaSerCysArgProL«uThrAspPheAspGloGlyT^l7?rolleS«rT^ 
1021 AGGCXAGCCAGCTGCCGACCCCXTACCGAXTXXGACCAGGGCTGGGGCCCXAXCAGXTAX 
TCCGATCGGTCGACGGCTGGGGAAXGGCTAAAACTGGTCCCGACCCCGGGATAGTCAATA 

AlaAsnGlySerGlyProAapGlnArgProTyrCysTrpHlsTyrProProLysProCys 
1081 GCCAACGGAAGCGGCCCCGACCAGCGCCCCTACTGCTGGCACTACCCCCCAAAACCTTGC 
CGGXXGCCXXCGCCGGGGCTGGXCGCGGGGAXGACGACCGXGATGGGGGCXXXXGGAACG 

GlyIleValProAlaLysS€rvalCysGlyProValTyrCysPh«thrPros«ProVal 
1141 GGTATTGTGCCCGCGAAGAGTGTGTGTGGTCCGGTATATTGCTTCACTCCCAGCCCCGXG 
CCATAACACGGGCGCTtCTCACACACACCAGGCCATATAAGGAAGTGAGGGTCGGGGCAC 

ValValGlythrthrAspArgSarGlyAlaProThrTyrStrTrpGlyGluAafiAspT^ 
1201 GTGGTGGGAACGACCG ACAGGTCGGGCGCGCCCACCXACAGCTGGGGTGAAAATGATACG 
CACCACCCTTGCTGGCTGXCCAGCCCGCGCGGGTGGAtGXCGACCCCACTTTTACTAXGC 

AspValPheVallAuAjnAsnXhrArgProProLeuGlyA^nTrpPheGlyCysXhrXrp 
1261 GACG7CXTCGXCCXXAACAAXACCAGGCCACCGCTGGGCAAXTGGXXCGGTXGX ACCXGG 
C7GCAGAAGCAGGAAXTGXXAXGGXCCGGXGGCGACCCGXXAACCAAGCCAACAXGGACC 

MetAsnSerXhrGlyPheXhrLysvalCysGlyAlaPtoProCysValileGlyGlyAla 
13 21 ATGAACTCAACXGGAXTCACCAAAGTGTGCGGAGCGCCXCCXTGTGTCAXCGGAGGGGCG 
XACXXGAGXXGACCXAAGTGGXXXCACACGCCXCGCGGAGGAACACAGXAGCCXCCCCGC 

GlyAsnAsnXhrLeuKisCysProThrAspCysPheArgLysKisProAspAlaXhrXyr 
1381 GGCAACAACACCCXGCACXGCCCCACXGAXXGCXTCCGCAAGCAXCCGGACGCCACAXAC 
CCGXXGXTGXGGGACGTGACGGGGTGACXAACGAAGGCGXXCGXAGGCCXGCGGXGXAXG 

SerArgCysGlySerGlyProXrpIleXhrProArgCysLeuValAapTyrProTyrArg 
1441 TCXCGGTGCGGCTCCGGTCCCXGGAXCACACCCAGGTGCCTGGXCGACTACCCGXAXAGG 
AGAGCCACCCCGAGGCCAGGGACCXAGTGIGGGICCACGGACCAGCXGAXGGGCAXAXCC 

LeuXrpKlsXyrProCysXhrlleAsnTyrXhrllePhftLyalleArgMetTyrValGly 
1501 CXTTGGCAXTATCCTTGXACCAXCAACXACACCAXATTIAAAAICAGGAXGXACGTGGGA 
GAAACCGXMXAGGAACAXGGXAGXXGAXGXGGXAXAAAXXTXAGXCCXACAXGCACCCX 

GlyValGluHisArgLeuGluAlaAlaCysAsnXrpXnrArgGlyGluArgCysAspLau 
1561 GGGGXCGAACACAGGCXGGAAGCXGCCXGCAACXGGACGCGGGGCGAACGXTGCGATCXG 
CCCCAGCXXGTGXCCGACCXXCGACGGACGXXGACCXGCGCCCCGCXXGCAACGCXAGAC 

GluAspArgAspArgS«xGluLauSerProI^uI^uI^uXhrThrthrGlnXrpGlnVal 
1621 GAAG AC AGGG ACAGG T CCG AG CTCAGCCCG TXACTGCX GACCACXACACAGTGG CAGG TC 
CXXCXGICCCTGXCCAGGCXCGAGXCGGGCAAXGACGACXGGXCAXGXGXC^CCG 

LftuProCyaSerPheXhrXhrLeuProAlaLeuSerThxGlyLauIleHlsUuKiJGln 
1681 CXCCCGTGTXCCTXCACAACCCXACCAGCCTTGTCCACCGGCCTCATCCACCTCCACCAG 
GAGGGCACAAGGAAGTGTTGGGATGGTCGGAACAGGTGGCCGGAGTAGGTCGAGCTGGTC 

AsnlleValAspValGlnTyrLeuTyrGlyVal^^ 
1741 AACATTGTGGACGXGCAGXACXTGTACGGGGTGGGGTCAAGCAXCGCGXCCTGGGCCAXT 
TTGTAACACCTGCACGTCATGAACATGCCCCACCCCAGXTCGTAGCGCAGGACCCGGTAA 

LysTrpGiuTyrValvall^uLeuPhel^u^ 
TTCACCCTCATGCAGCAAGAGGACAAGGAAGACGAACGTCTGCGCGCGCAGACGAGGACG

X^uTrpKet^HatX«uI*uIleSerGlii^ 
1861 TTGTGGATGATGCTACTCATATCCCAAGCGGAGGCGGCTTTGGAGAACCTCGTAATACTT 
AACACCTACTACGATGA6TATAGGGTTCGCCTCCGCCGAAACCTCTTGGAGCATTATGAA 

AsnAlaAlaSerLeuAlaGlyThrHi4iGlyteuvalS«rPhftLeuValPhftPh€CysPhe 
1921 AATGCAGCATCCCTGGCCGGG ACGCACGGTCTTGTATCCrrCCTCGTGTTCTTCTGCTTT 
TTACGTCGTAGGGACCGGCCCTGCGTGCCAGAACATAGGAAGGAGCACAAGAAGACGAAA 

AlaTrpTyrLeuLysGlytysTrpValProGlyAlaValTyrThrPheTyrGlyMetTrp 
1981 GCATGGTATTTGAAGGGTAAGTGGGTGCCCGGAGCGGTCTACACCTTCTACGGGATGTGG 
CGTACCATAAACTTCCG^TTCACCCACGGGCCTCGCCAGATGTGGAAGATGCCCTACACC 

ProI^uLeuI^uLeuI^uJ^uAlal^uProG^ 
2041 CCTCTCCTCCTGCTCCTGTTGGCGTXGCCCCAGCGGGCGTACGCGCTGGACACGGAGGTG 
GGAGAGGAGGACGAGGACAACCGCAACGGGGTCGCCCGCATGCGCGACCTGTGCCTCCAC 

AlaAlaSerCysGlyGlyValValL«uValGlyLeuMetAlaLcuThrLeuS«rPraTyr 
2101 GCCGCGTCGTGTGGCGGTCTTGTTCTCGTCGGGTTGATGGCGCTGACTCTGTCACCATAT 
CGGCGCAGCACACCGCCACAACAAGAGCAGCCCAACTACCGCGACTGAGACAGTGGTATA 

Ty rty sArgty rll tS«rTrpCy s L4uTrpTrpLeuGln?yrph*L«uThr Ar gValGl u 
2161 TACAAGCGCTATATCAGCTGGTGCTTGTGGTGGCTTCAGTATTTTCTGACCAGAGTCGAA 
ATGTTCGCGATATAGTCGACCACGAACACCACCGAAGTCATAAAAGACTGGTCTCACCTT 

AlaGlnL«uHi*ValXrpIl«ProProLeuAJnValArgGlyt31yArgAjipAlavalIle 
2221 GCGCAACTGCACGTGTGGATTCCCCCCCTCAACGTCCGAGGGGGGCGCGACGCCGTCATC 
CGCGTTGACGTGCACACCtAAGGGGGGGAGTTGCAGGCtCCCCCCGCGCTGCGGCAGTAG 

LeuLeuMatCysAlaValHisProThrr^uValPheAspIleThrLysJ^uX^uI^uAla 
2 2 B 1 TTACTCATGTGTGCTGTACACCCGACTCTGGTATTTGACATCACCAAATTGCTGCTGGCC 

AATGAGTACACACGACATGTGGGCTGAGACCATAAACTGTAGTGGTTTAACGACGACCGG 

* 

valPheGlVp^l*uTrp!l^euGlnAlas« 
2341 gtcttcggacccctttggattcttcaagccagtttgcttaaagtaccctactttgtgcgc 
cagaagcctggggaaacctaagaagttcggtcaaacgaatttcatgggatgaaacacgcg 

valGlnGlyi^ul^uArgPheCysAlaLeuAlaArgLysHetlleGlyGlyHlstyryal 
2401 GTCCAAGGCCTTCTCCGGTTCTGCGCGTTAGCGCGGAAGATGATCGGAGGCCATTACGTG 
CAGGTTCCGGAAGAGGCCAAGACGCGCAATCGCGCCTtCTACTAGCCTCCGGTAATGCAC 

GlnMetValIleIleLysl*uGlyAlaI*uThxGly^^ 
2461 CAAATGGTCATCATTAAGTTAGGGGCGCTTACTGGCACCTATGTTTATAACCATCTCACT 
GTTTACCAGTAGTAATTCAATCCCCGCGAATGACCGTGGATACAAATATTGGTAGAGTGA 

ProLeuArgAspTrpAlaHiaAanGlyLeuArgAspLeuAlaValAlaValGluProVal 
2521 CCTCTTCGGGACTGGGCGCACAACGGCTTGCGAGATCTGGCCGTGGCTGTAGAGCCAGTC 
GG AGAAGCCCTGACCCGCGTGTTGCCGAACGCTCTAGACCGGCACCGACATCTCGGTCAG 

ValPheSerGlnMdtGluThrLysl^ulleThrTrpGlyAlaAspThrAlaAlacysGly 
2S81 GTCTTCTCCCAAATGGAGACCAAGCTCATCACGTGGGGGGCAGATACCGCCGCGTGCGGT 
CAGAAGAGGGTTTACCTCTGGTTCGAGTAGTGCACCCCCCGTCTATGGCGGCGCACGCCA 

AapIleIlcAsnGlyL«uProValS«rAlaArgAxgGlyArgtSlull€l>uL«uGlyPro 
2641 GACATCATCAACGGCTTGCCTGTTTCCGCCCGCAGGGGCCGGGAGATACTGCTCGGGCCA 
CTGTAGTAGTTGCCGAACGGACAAAGGCGGGCGTCCCCGGCCCTCTATGACGAGCCCGGT 

AlaAspGlyHetValSerLysGlyTrpArgl^uLeuAlaProllelhrAlaTyrAlaGlR 
2701 GCCGATGGAATGGTCTCCAAGGGGTGGAGGTTGCTGGCGCCCATCACGGCG1ACGCCCAG 
CGGCTACCTTACCAGAGGTTCCCCACCTCCAACGACCGCGGGTAGTGCCGCATGCGGGTC 

GlnThrArgGlyt^uI^uGlyCysllelleThrS^ 

'27 61 CAGACAAGGGGCCTCCTAGGGTGCATAATCACCAGCCT^ 

GTCTGTTCCCCGGAGGATCCCACGTATTAGTGGTCGGATTGACCGGCCCTGTTXTTGGTT 
ValGluGlyGluValGlnIleValSerThrAlaAlaGlnThrPheLeuAlaThrCysIle
2821 GTGGAGGGTGAGGTCCAGATTGTGTCAACTGCTGCCCAAACCTTCCTGGCAACGTGCATC
CACCTCCCACTCCAGGTCTAACACAGTTGACGACGGGTTTGGAAGGACCGTTGCACGTAG

AjnGlyValCyalrpThrValTyrHiaGlyAlaGlyThrArgThrlleAlaSerProLys 
-2881 AATGGGGTGTGCTGGACTGTCTACCACGGGGCCGGAACGAGGACCATCGCGTCACCCAAG 
TTACCCCACACGACCTGACAGATGGTGCCCCGGCCTT GC g CC TGGTACCGCAGTGGGTTC 

GlyProValHeGlnMetTyrTIvrAsnValAspGlnAjpL«uvalClyTrpProAlaPra 
2941 GGTCCTGTCATCCAGATGTATACCAATGTAGACCAAGACCTTGTGGGCTGGCCCGCTCCG 
CCAGGACAGTAGGTCTACATATGGTTACATCTGGTTCTGGAACACCCGACCGGGCGAGGC 

GlnGlySerArgS«rLeuThrProC7sThrCysGlySerS«rA*pL«uTyrL«uVallltr 
3001 CAAGGTAGCCGCTCATTGACACCCTGCACTTGCGGCTCCTCGGACCTTTACCTGGTCACG 
GTTCCATCGGCGAGTAACTGTGGGACGTGAACGCCGAGGAGCCTGGAAATGGACCAGTGC 

ArgHiaAlaAspvalilePrcjValArgArgArgGlyAapSarArgGlTSerLeuLeuSer 
3061 AGGCACGCCGATGTCATTCCCGTGCGCCGGCGGGGTGATAGCAGGGGCAGCCTGCTGTCG 
TCCGTGCGGCTACAGTAAGGGCACGCGGCCGCCCCACTATCGTCCCCGTCGGACGACAGC 

ProArgProlleS«rTyrl«uLysGlySerSerGlyGlyProLeuLeuCysProAlaGly 
3121 CCCCGGCCCATTTCCTACTTGAAAGGCTCCTCGGGGGGTCCGCTGTTGTGCCCCGCGGGG 
GGGGCCGGGTAAAGGATGAACTTTCCGAGGAGCCCCCCAGGCGACAACACGGGGCGCCCC 

HisAlaValGlyIlePheArgAlaAlaValCysThrArgGlyValAlaLysAlaValAsp
3181 CACGCCGTGGGCATATTTAGGGCCGCGGTGTGCACCCGTGGAGTGGCTAAGGCGGTGGAC
GTGCGGCACCCGTATAAATCCCGGCGCCACACGTGGGCACCTCACCGATTCCGCCACCTG 

PheIleProValGluAsnLeuGluThrThrMetArgSerProValPheThrAspAsnSer
3241 TTTATCCCTGTGGAGAACCTAGAGACAACCATGAGGTCCCCGGTGTTCACGGATAACTCC 
AAATAGGGACACCTCTTGGATCTCTGTTGGTACTCCAGGGGCCACAAGTGCCTATTGAGG 

SerProProValValProGlnSerPheGlnValAlaHisLeuHisAlaProThrGlySer
3301 TCTCCACCAGTAGTGCCCCAGAGCTTCCAGGTGGCTCACCTCCATGCTCCCACAGGCAGC 
AGAGGTGGTCATCACGGGGTCTCGAAGGTCCACCGAGTGGAGGTACGAGGGTCTCCGTCG 

GlyLysSerThrLysValProAlaAlaTyrAlaAlaGlnGlyTyrLysValLeuValLeu
3361 GGCAAAAGCACCAAGGTCCCGGCTGCATATGCAGCTCAGGGCTATAAGGTGCTAGTACTC
CCGTTTTCGTGGTTCCAGGGCCGACGTATACGTCGAGTCCCGATATTCCACGATCATGAG

AsnProSerVamaAlaThrLeuGlyPheGlyAla^^^rtyaA^lsGlyll^ 

3421 ssss^sss^gsgssss^s^ 

3481 fflSSESSS^^ 

CTAGGATTGTAGTCCTG^CCCACTCTTGTTAATGGTGACCGTCGGGGTAGTC 

ThrTyrGlyLysPhetAuAlaAspGlyGlyCysSerGlyGlyAl^ 
3541 ACCTA^C^GTTCCTTCCCGACGGCGGGTGCTCGGGGGGCGCTTATGJ^ 

TGGATCCCGTTCAAGGAACGGCTGCCGCCCACGAGCCCCCCGC6AATACTGTATTATTAA 

3661 CAAGCAGAGACTGCGGGGGCGAGACTGGTTGTGCTCGCCACCGCCACCCCTCCM 
^CGTCTCTCACGC^GCTCTGACCAACACGAGCGGTCGCM 

valThrValProHl'aProAjnlleGluGluVaUlal^uSerT^^lyGlul^o 
3721 GTCACTGTGCCCCATCCCAACATCGAGGAGGTTGCTCTGTCCACCACCG^^ 
CAGTCACACGG^GTAGGGTTGTAGCTCCTCCAACGAG 

B1 "gSBSI^^
AAAATGCCGTTCGJATAG5GG5AGCtTCATTA6rrCCCCCCCTCTGTAGAGTAGAAGACA

HisSerLysLysLysCysAspGluLeuAlaAlaLysLeuValAlaLeuGlyIleAsnAla
3841 CATTCAAAGAAGAAGTGCGACGAACTCGCCGCAAAGCTGGTCGCATTGGGCATCAATGCC
GTAAGTTTCTTCTTCACGCTGCTTGAGCGGCGTTTCGACCAGCGTAACCCGTAGTTACGG

ValAlaTyrTyrArgGlyLeuAspValSerValIleProThrSerGlyAspValValVal
3901 GTGGCCTACTACCGCGGTCTTGACGTGTCCGTCATCCCGACCAGCGGCGATGTTGTCGTC
CACCGGATGATGGCGCCAGAACTGCACAGGCAGTAGGGCTGGTCGCCGCTACAACAGCAG 

ValAlaThrAspAlaLeuMetThrGlyTyrThrGlyAspPheAspSerValIleAspCys
3961 GTGGCAACCGATGCCCTCATGACCGGCTATACCGGCGACTTCGACTCGGTGATAGACTGC 
CACCGTTGGCTACGGGAGTACTGGCCGATATGGCCGCTGAAGCTGAGCCACTATCTGACG 

AsnThrCysValThrGlnThrValAspPheSerLeuAspProThrPheThrIleGluThr
4021 AATACGTGTGTCACCCAGACAGTCGATTTCAGCCTTGACCCTACCTTCACCATTGAGACA
TTATGCACACAGTGGGTCTGTCAGCTAAAGTCGGAACTGGGATGGAAGTGGTAACTCTGT

IleThrLeuProGlnAspAlaValSerArgThrGlnArgArgGlyArgThrGlyArgGly
4081 ATCACGCTCCCCCAGGATGCTGTCTCCCGCACTCAACGTCGGGGCAGGACTGGCAGGGGG
TAGTGCGAGGGGGTCCTACGACAGAGGGCGTGAGTTGCAGCCCCGTCCTGACCGTCCCCC

LysProGlyIleTyrArgPheValAlaProGlyGluArgProSerGlyMetPheAspSer
4141 AAGCCAGGCATCTACAGATTTGTGGCACCGGGGGAGCGCCCCTCCGGCATGTTCGACTCG
TTCGGTCCGTAGATGTCTAAACACCGTGGCCCCCTCGCGGGGAGGCCGTACAAGCTGAGC

SerValLeuCysGluCysTyrAspAlaGlyCysAlaTrpTyrGluLeuThrProAlaGlu
4201 TCCGTCCTCTGTGAGTGCTATGACGCAGGCTGTGCTTGGTATGAGCTCACGCCCGCCGAG
AGGCAGGAGACACTCACGATACTGCGTCCGACACGAACCATACTCGAGTGCGGGCGGCTC 

ThrThrValArgLeuArgAlaTyrMetAsnThrProGlyLeuProValCysGlnAspHis
4261 ACTACAGTTAGGCTACGAGCGTACATGAACACCCCGGGGCTTCCCGTGTGCCAGGACCAT
TGATGTCAATCCGATGCTCGCATGTACTTGTGGGGCCCCGAAGGGCACACGGTC 

LeuGluPheTrpGluGlyValPheThrGlyLeuThrHisIleAspAlaHisPheLeuSer
4321 CTTGAATTTTGGGAGGGCGTCTTTACAGGCCTCACTCATATAGATGCCCACTTTCTATCC
GAACTTAAAACCCTCCCGCAGAAATGTCCGGAGTGAGTATATCTACGGGTGAAAGATAGG

GlnThrLysGlnSerGlyGluAsnLeuProTyrLeuValAlaTyrGlnAlaThrValCys
4381 CAGACAAAGCAGAGTGGGGAGAACCTTCCTTACCTGGTAGCGTACCAAGCCACCGTGTGC
GTCTGTTTCGTCTCACCCCTCTTGGAAGGAATGGACCATCGCATGGTTCGGTGGCACACG

AlaArgAlaGlnAlaProProProSerTrpAspGlnMetTrpLysCysLeuIleArgLeu
4441 GCTAGGGCTCAAGCCCCTCCCCCATCGTGGGACCAGATGTGGAAGTGTTTGATTCGCCTC 
CGATCCCGAGTTCGGGGAGGGGGTAGCACCCTGGTCTACACCTTCACAAACTAAGCGGAG 

LysProThrLeuHisGlyProThrProLeuLeuTyrArgLeuGlyAlaValGlnAsnGlu
"4 50 1 AAGCCCACCCTC<^TGGGCCAACACCCCTGCTATXCAGACTCGGC 

TTCGCGTGGGAGGTACCCGGTTGTGGGGACGATAtGTCTGACCCGCGACAAGTCTTACTT 

IleThrLeuThrHi*?xuV\arnrLy3-ty^ 
4561 ATCACCCTGACGCACCGAGTCACCAAATACATCATGACATGCATGTCGGCCGACCTGGAG 
TAGTGGGACTGCGTGGGTCAGTGGTTTATCTAGTACTGTACGTACAGCCGGCTGGACCTC 

ValValThrSerThrTrpValLeuValGlyGlyValLeuAlaAlaLeuAlaAlaTyrCys
4621 GTCGTCACGAGCACCTGGGTGCTCGTTGGCGGCGTCCTGGCTGCTTTGGCCGCGTATTGC 
CAGCAGTGCTCGTGGACCCACGAGCAACCGCCGCAGGACCGACGAAACCGGCGCATAACG 

LeuSerThrGlyCysValValIleValGlyArgValValLeuSerGlyLysProAlaIle
4681 CTGTCAACAGGCTGCGTGGTCATAGTGGGCAGGGTCGTCTTGTCCGGGAAGCCGGCAATC
GACAGTTGTCCGACGCACCAGTATCACCCGTCCCAGCAGAACAGGCCCTTCGGCCGTTAG

IleProAspArgGluValLeuTyrArgGluPheAspGluMetGluGluCysSerGlnHis
4741 ATACCTGACAGGGAAGTCCTCTACCGAGAGTTCGATGAGATGGAAGAGTGCTCTCAGCAC
TATGGACTGTCCCTTCAGGAGATGGCTCTCAAGCTACTCTACCTTCTCACGAGAGTCGTG

LeuProTyrIleGluGlnGlyMetMetLeuAlaGluGlnPheLysGlnLysAlaLeuGly
4801 TTACCGTACATCGAGCAAGGGATGATGCTCGCCGAGCAGTTCAAGCAGAAGGCCCTCGGC
AATGGCATGTAGCTCGTTCCCTACTACGAGCGGCTCGTCAAGTTCGTCTTCCGGGAGCCG

LeuLeuGlnThrAlaSerArgGlnAlaGluValIleAlaProAlaValGlnThrAsnTrp
4861 CTCCTGCAGACCGCGTCCCGTCAGGCAGAGGTTATCGCCCCTGCTGTCCAGACCAACTGG
GAGGACGTCTGGCGCAGGGCAGTCCGTCTCCAATAGCGGGGACGACAGGTCTGGTTGACC 

GlnLysLeuGluThrPheTrpAlaLysHisMetTrpAsnPheIleSerGlyIleGlnTyr
4921 CAAAAACTCGAGACCTTCTGGGCGAAGCATATGTGGAACTTCATCAGTGGGATACAATAC
GTTTTTGAGCTCTGGAAGACCCGCTTCGTATACACCTTGAAGTAGTCACCCTATGTTATG

LeuAlaGlyLeuSerThrLeuProGlyAsnProAlaIleAlaSerLeuMetAlaPheThr

4981 TTGGCGGGCTTGTCAACGCTGCCTGGTAACCCCGCCATTGCTTCATTGAT^

AACCGCCCGAACAGTTGCGACGGACCATTGGGGCGGTAACGAAGTAACTACCGAAAATGT

AlaAlaValThrSerProLeuThrThrSerGlnThrLeuLeuPheAsnIleLeuGlyGly
5041 GCTGCTGTCACCAGCCCACTAACCACTAGCCAAACCCTCCTCTTCAACATATTGGGGGGG 
CGACGACAGTGGTCGGGTGATTGGTGATCGGTTTGGGAGGAGAAGTTGTATAACCCCCCC 

TrpValAlaAlaGlnLeuAlaAlaProGlyAlaAlaThrAlaPheValGlyAlaGlyLeu
5101 TGGGTGGCTGCCCAGCTCGCCGCCCCCGGTGCCGCTACTGCCTTTGTGGGCGCTGGCTTA
ACCCACCGACGGGTCGAGCGGCGGGGGCCACGGCGATGACGGAAACACCCGCGACCGAAT 

AlaGlyAlaAlaIleGlySerValGlyLeuGlyLysValLeuIleAspIleLeuAlaGly
5161 GCTGGCGCCGCCATCGGCAGTGTTGGACTGGGGAAGGTCCTCATAGACATCCTTGCAGGG
CGACCGCGGCGGTAGCCGTCACAACCTGACCCCTTCCAGGAGTATCTGTAGGAACGTCCC

TyrGlyAlaGlyValAlaGlyAlaLeuValAlaPheLysIleMetSerGlyGluValPro
5221 TATGGCGCGGGCGTGGCGGGAGCTCTTGTGGCATTCAAGATCATGAGCGGTGAGGTCCCC
ATACCGCGCCCGCACCGCCCTCGAGAACACCGTAAGTTCTAGTACTCGCCACTCCAGGGG

SerThrGluAspLeuValAsnLeuLeuProAlaIleLeuSerProGlyAlaLeuValVal

5281 TCCACGGAGGACCTGGTCAATCTACTGCCCGCCATCCTCTCGCCCGGAGCCCTCGTAGTC

AGGTGCCTCCTGGACCAGTTAGATGACGGGCGGTAGGAGAGCGGGCCTCGGGAGCATCAG 





IleAlaPheAlaSerArgGlyAsnHisValSerProThrHisTyr



5401



ACCTACTTGGCCGACTATCGGAAGCGGAGGGCCCCCTTGGTACAAAGGGGGTGCGTGATG 



5461 



5521 



5581 



5701 



5641 




IleThrGlyHisValLysAsnGlyThrMet^
TAGTGACCTGTACAGTTTTTGCCCTGCTACTCCTAGCAGCCAGGATCCTGGACGTCCTTG

MetTrpSerGlyThrPheProIleAsnAlaTyrThrThrGlyProCysThrProLeuPro 
5821 ATGTGGAGTGGGACCTTCCCCATTAATGCCTACACCACGGGCCCCTGTACCCCCCTTCCT
TACACCTCACCCTGGAAGGGGTAATTACGGATGTGGTGCCCGGGGACATGGGGGGAAGGA 

AlaProAsnTyrThrPheAlaLeuTrpArgValSerAlaGluGluTyrValGluIleArg
5881 GCGCCGAACTACACGTTCGCGCTATGGAGGGTGTCTGCAGAGGAATATGTGGAGATAAGG
CGCGGCTTGATGTGCAAGCGCGATACCTCCCACAGACGTCTCCTTATACACCTCTATTCC 

GlnValGlyAspPheHisTyrValThrGlyMetThrThrAspAsnLeuLysCysProCys
5941 CAGGTGGGGGACTTCCACTACGTGACGGGTATGACTACTGACAATCTCAAATGCCCGTGC
GTCCACCCCCTGAAGGTGATGCACTGCCCATACTGATGACTGTTAGAGTTTACGGGCACG

GlnValProSerProGluPhePheThrGluLeuAspGlyValArgLeuHisArgPheAla
6001 CAGGTCCCATCGCCCGAATTTTTCACAGAATTGGACGGGGTGCGCCTACATAGGTTTGCG
GTCCAGGGTAGCGGGCTTAAAAAGTGTCTTAACCTGCCCCACGCGGATGTATCCAAACGC

ProProCysLysProLeuLeuArgGluGluValSerPheArgValGlyLeuHisGluTyr 
6061 CCCCCCTGCAAGCCCTTGCTGCGGGAGGAGGTATCATTCAGAGTAGGACTCCACGAATAC 
GGGGGGACGTTCGGGAACGACGCCCTCCTCCATAGTAAGTCTCATCCTGAGGTGCTTATG

ProValGlySerGlnLeuProCysGluProGluProAspValAlaValLeuThrSerMet
6121 CCGGTAGGGTCGCAATTACCTTGCGAGCCCGAACCGGACGTGGCCGTGTTGACGTCCATG 
GGCCATCCCAGCGTTAATGGAACGCTCGGGCTTGGCCTGCACCGGCACAACTGCAGGTAC 

LeuThrAspProSerHisIleThrAlaGluAlaAlaGlyArgArgLeuAlaArgGlySer
6181 CTCACTGATCCCTCCCATATAACAGCAGAGGCGGCCGGGCGAAGGTTGGCGAGGGGATCA
GAGTGACTAGGGAGGGTATATTGTCGTCTCCGCCGGCCCGCTTCCAACCGCTCCCCTAGT

ProProSerValAlaSerSerSerAlaSerGlnLeuSerAlaProSerLeuLysAlaThr
6241 CCCCCCTCTGTGGCCAGCTCCTCGGCTAGCCAGCTATCCGCTCCATCTCTCAAGGCAACT
GGGGGGAGACACCGGTCGAGGAGCCGATCGGTCGATAGGCGAGGTAGAGAGTTCCGTTGA

CysThrAlaAsnHisAspSerProAspAlaGluLeuIleGluAlaAsnLeuLeuTrpArg
6301 TGCACCGCTAACCATGACTCCCCTGATGCTGAGCTCATAGAGGCCAACCTCCTATGGAGG
ACGTGGCGATTGGTACTGAGGGGACTACGACTCGAGTATCTCCGGTTGGAGGATACCTCC


GlnGluMetGlyGlyAsnIleThrArgValGluSerGluAsnLysValValIleLeuAsp
6361 CAGGAGATGGGCGGCAACATCACCAGGGTTGAGTCAGAAAACAAAGTGGTGATTCTGGAC
GTCCTCTACCCGCCGTTGTAGTGGTCCCAACTCAGTCTTTTGTTTCACCACTAAGACCTG

SerPheAspProLeuValAlaGluGluAspGluArgGluIleSerValProAlaGluIle
6421 TCCTTCGATCCGCTTGTGGCGGAGGAGGACGAGCGGGAGATCTCCGTACCCGCAGAAATC
AGGAAGCTAGGCGAACACCGCCTCCTCCTGCTCGCCCTCTAGAGGCATGGGCGTCTTTAG

LeuArgLysSerArgArgPheAlaGlnAlaLeuProValTrpAlaArgProAspTyrAsn 
6481 CTGCGGAAGTCTCGGAGATTCGCCCAGGCCCTGCCCGTTTGGGCGCGGCCGGACTATAAC
GACGCCTTCAGAGCCTCTAAGCGGGTCCGGGACGGGCAAACCCGCGCCGGCCTGATATTG

ProProLeuValGluThrTrpLysLysProAspTyrGluProProValValHisGlyCys
6541 CCCCCGCTAGTGGAGACGTGGAAAAAGCCCGACTACGAACCACCTGTGGTCCATGGCTGT
GGGGGCGATCACCTCTGCACCTTTTTCGGGCTGATGCTTGGTGGACACCAGGTACCGACA

ProLeuProProProLysSerProProValProProProArgLysLysArgThrValVal
6601 CCGCTTCCACCTCCAAAGTCCCCTCCTGTGCCTCCGCCTCGGAAGAAGCGGACGGTGGTC
GGCGAAGGTGGAGGTTTCAGGGGAGGACACGGAGGCGGAGCCTTCTTCGCCTGCCACCAG

LeuThrGluSerThrLeuSerThrAlaLeuAlaGluLeuAlaThrArgSerPheGlySer
6661 CTCACTGAATCAACCCTATCTACTGCCTTGGCCGAGCTCGCCACCAGAAGCTTTGGCAGC
GAGTGACTTAGTTGGGATAGATGACGGAACCGGCTCGAGCGGTGGTCTTCGAAACCGTCG

SerSerXhrSei<UyrieXhrGlyAspAsnThrXhrt^ 
6721 TC CTCAAC XXC CGG CA T T ACGGGCG ACAAT ACGACAACATC CXCTG AGCCCGCCCCXXCX 
AGGAGTreaA^rrflTAATrtfleMrw^
6781



6841 



6901 



6961 



CCGACGGGGGGGCXGAGGCTGCGACTCAGGATAAGGAGGTACGGGGGGGACCTCCCCCTC 
CCTGGGGATCCGGATCTTAGCGACGGGTW^ 

GluAspValValCysCysSerMetSerTyrSerTrpThrGlyAlaLeuValThrProCys
GAGGATGTCGTGTGCTGCTCAATGTCTTACTCTTGGACAGGCGCACTCGTCACCCCGTGC
CTCCTACAGCACACGACGAGTTACAGAATGAGAACCTGTCCGCGTGAGCAGTGGGGCACG

AlaAlaGluGluGlnLysLeuProIleAsnAlaLeuSerAsnSerLeuLeuArgHisHis
GCCGCGGAAGAACAGAAACTGCCCATCAATGCACTAAGCAACTCGTTGCTACGTCACCAC
CGGCGCCTTCTTGTCTTTGACGGGTAGTTACGTGATTCGTTGAGCAACGATGCAGTGGTG

7021 ilrac?wi^ 

TTAAACCACATAAGGTGGTGGAGTGCGTCACGAACGGTTTCCGTCTTCTTTCAGTGTAAA

AspArgLeuGlnValLeuAspSerHisTyrGlnAspValLeuLysGluValLysAlaAla
7081 GACAGACTGCAAGTTCTGGACAGCCATTACCAGGACGTACTCAAGGAGGTTAAAGCAGCG
CTGTCTGACGTTCAAGACCTGTCGGTAATGGTCCTGCATGAGTTCCTCCAATTTCGTCGC 

AlaSerLysValLysAlaAsnLeuLeuSerValGluGluAlaCysSerLeuThrProPro
7141 GCGTCAAAAGTGAAGGCTAACTTGCTATCCGTAGAGGAAGCTTGCAGCCTGACGCCCCCA 
CGCAGTTTTCACTTCCGATTGAACGATAGGCATCTCCTTCGAACGTCGGACTGCGGGGGT 

HisSerAlaLysSerLysPheGlyTyrGlyAlaLysAspValArgCysHisAlaArgLys
7201 CACTCAGCCAAATCCAAGTTTGGTTATGGGGCAAAAGACGTCCGTTGCCATGCCAGAAAG 
GTGAGTCGGTTTAGGTTCAAACCAATACCCCGTTTTCTGCAGGCAACGGTACGGTCTTTC 

AlaValThrHisIleAsnSerValTrpLysAspLeuLeuGluAspAsnValThrProIle
7261 GCCGTAACCCACATCAACTCCGTGTGGAAAGACCTTCTGGAAGACAATGTAACACCAATA
CGGCATTGGGTGTAGTTGAGGCACACCTTTCTGGAAGACCTTCTGTTACATTGTGGTTAT

A5pThrThrIletotAlaLyaAjiiK51uVaIPh«<^»v%i^inProGluLysGlyGlvAr 
7 3 2* GACACT ACCATCATGGCTAAG AACGAGG rTT T C C G COTTCAGceTOAGAAoodGGGTCGr 
CTGTGATGGTAGTACCGATTCTTGCTCGiAAAGACGCAAGTCGGACTCTTCCCCCCAGCA 

LysProAlaArgLeuIleValPheProAspLeuGlyValArgValCysGluLysMetAla
7381 AAGCCAGCTCGTCTCATCGTGTTCCCCGATCTGGGCGTGCGCGTGTGCGAAAAGATGGCT
TTCGGTCGAGCAGAGTAGCACAAGGGGCTAGACCCGCACGCGCACACGCTTTTCTACCGA 

LeuTyrAspValValThrLysLeuProLeuAlaValMetGlySerSerTyrGlyPheGln
7441 TTGTACGACGTGGTTACAAAGCTCCCCTTGGCCGTGATGGGAAGCTCCTACGGATTCCAA
AACATGCTGCACCAATGTTTCGAGGGGAACCGGCACTACCCTTCGAGGATGCCTAAGGTT

TyrSerProGlyGlnArgValGluPheLeuValGlnAlaTrpLysSerLysLysThrPro
7501 TACTCACCAGGACAGCGGGTTGAATTCCTCGTGCAAGCGTGGAAGTCCAAGAAAACCCCA
ATGAGTGGTCCTGTCGCCCAACTTAAGGAGCACGTTCGCACCTTCAGGTTCTTTTGGGGT

MetGlyPheSerTyrAspThrArgCysPheAspSerThrValThrGluSerAspIleArg
7561 ATGGGGTTCTCGTATGATACCCGCTGCTTTGACTCCACAGTCACTGAGAGCGACATCCGT
TACCCCAAGAGCATACTATGGGCGACGAAACTGAGGTGTCAGTGACTCTCGCTGTAGGCA

ThrGluGluAlaIleTyrGlnCysCysAspLeuAspProGlnAlaArgValAlaIleLys
7621 ACGGAGGAGGCAATCTACCAATGTTGTGACCTCGACCCCCAAGCCCGCGTGGCCATCAAG
TGCCTCCTCCGTTAGATGGTTACAACACTGGAGCTGGGGGTTCGGGCGCACCGGTAGTTC

SerLeuThrGluArgLeuTyrValGlyGlyProLeuThrAsnSerArgGlyGluAsnCys
7681 TCCCTCACCGAGAGGCTTTATGTTGGGGGCCCTCTTACCAATTCAAGGGGGGAGAACTGC
AGGGAGTGGCTCTCCGAAATACAACCCCCGGGAGAATGGTTAAGTTCCCCCCTCTTGACG

CCGATAGCGTCCACGGCGCGCTCGCCGCATGACTGTTGATCGACACCATTGTGGGAGTGA

CysTyrIleLysAlaArgAlaAlaCysArgAlaAlaGlyLeuGlnAspCysThrMetLeu
7801 TGCTACATCAAGGCCCGGGCAGCCTGTCGAGCCGCAGGGCTCCAGGACTGCACCATGCTC
ACGATGTAGTTCCGGGCCCGTCGGACAGCTCGGCGTCCCGAGGTCCTGACGTGGTACGAG 

ValCysGlyAspAspLeuValValIleCysGluSerAlaGlyValGlnGluAspAlaAla
7861 GTGTGTGGCGACGACTTAGTCGTTATCTGTGAAAGCGCGGGGGTCCAGGAGGACGCGGCG
CACACACCGCTGCTGAATCAGCAATAGACACTTTCGCGCCCCCAGGTCCTCCTGCGCCGC

SerLeuArgAlaPheThrGluAlaMetThrArgTyrSerAlaProProGlyAspProPro
7921 AGCCTGAGAGCCTTCACGGAGGCTATGACCAGGTACTCCGCCCCCCCTGGGGACCCCCCA
TCGGACTCTCGGAAGTGCCTCCGATACTGGTCCATGAGGCGGGGGGGACCCCTGGGGGGT

GlnProGluTyrAspLeuGluLeuIleThrSerCysSerSerAsnValSerValAlaHis
7981 CAACCAGAATACGACTTGGAGCTCATAACATCATGCTCCTCCAACGTGTCAGTCGCCCAC 
GTTGGTCTTATGCTGAACCTCGAGTATTGTAGTACGAGGAGGTTGCACAGTCAGCGGGTG 

AspGlyAlaGlyLysArgValTyrTyrLeuThrArgAspProThrThrProLeuAlaArg
8041 GACGGCGCTGGAAAGAGGGTCTACTACCTCACCCGTGACCCTACAACCCCCCTCGCGAGA 
CTGCCGCGACCTTTCTCCCAGATGATGGAGTGGGCACTGGGATGTTGGGGGGAGCGCTCT 

AlaAlaTrpGluThrAlaArgHisThrProValAsnSerTrpLeuGlyAsnIleIleMet
8101 GCTGCGTGGGAGACAGCAAGACACACTCCAGTCAATTCCTGGCTAGGCAACATAATCATG
CGACGCACCCTCTGTCGTTCTGTGTGAGGTCAGTTAAGGACCGATCCGTTGTATTAGTAC

PheAlaProThrLeuTrpAlaArgMetIleLeuMetThrHisPhePheSerValLeuIle
8161 TTTGCCCCCACACTGTGGGCGAGGATGATACTGATGACCCATTTCTTTAGCGTCCTTATA
AAACGGGGGTGTGACACCCGCTCCTACTATGACTACTGGGTAAAGAAATCGCAGGAATAT

AlaArgAspGlnLeuGluGlnAlaLeuAspCysGluIleTyrGlyAlaCysTyrSerIle
8221 GCCAGGGACCAGCTTGAACAGGCCCTCGATTGCGAGATCTACGGGGCCTGCTACTCCATA
CGGTCCCTGGTCGAACTTGTCCGGGAGCTAACGCTCTAGATGCCCCGGACGATGAGGTAT

GluProLeuAspLeuProProIleIleGlnArgLeu
8281 GAACCACTTGATCTACCTCCAATCATTCAAAGACTC 
CTTGGTGAACTAGATGGAGGTTAGTAAGTTTCTGAG 
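
The three-line layout used throughout these sequence figures (deduced amino acids on top, coding strand in the middle, complementary strand underneath) can be checked mechanically. The following is a minimal Python sketch and is not part of the original disclosure; the function names `translate` and `complement` are illustrative assumptions, and the standard genetic code is assumed. It is applied to the final block of the figure above, which begins at nucleotide 8281.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
ONE_LETTER = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "End",
}
CODON_TABLE = {
    "".join(codon): THREE_LETTER[aa]
    for codon, aa in zip(product(BASES, repeat=3), ONE_LETTER)
}


def translate(coding_strand: str) -> str:
    """Translate a coding strand, read from its first base, into
    concatenated three-letter residue names. Trailing bases that do
    not fill a whole codon are ignored."""
    seq = coding_strand.upper()
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


def complement(strand: str) -> str:
    """Return the base-by-base complement, written in the same
    left-to-right order as the input (as in the figures)."""
    return strand.upper().translate(str.maketrans("ACGT", "TGCA"))


if __name__ == "__main__":
    block = "GAACCACTTGATCTACCTCCAATCATTCAAAGACTC"  # nucleotides 8281 onward
    print(translate(block))
    print(complement(block))
```

Running the sketch on any intact block should reproduce the amino acid line above it and the complementary line below it; a mismatch points to a transcription error in one of the three lines.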
-319 CACTCCACCATGAATCACTCCCCTGTGAGGAACTACTGTCTTCACGCAGAAAGCGTCTAG
GTGAGGTGGTACTTAGTGAGGGGACACTCCTSGATGACA 

-259 CCATOGCGMMak^ 

GG!EACCGCAATCATACTCACAGCACGXCGGAGGTCC^ 

-139 GTGGTCTGCGGAACCGGTGAGTACACCGGAATTGCCAGGA^ 
CACCAGACGCCI-TGGC<^^ 

-13 9 TCAACCCGCTCAATGCCTG^^ 
AGCTGGGCGAGTI&GG<^^ 

- 79 GSTGGG^GCGAAAGGCCITOTC^ 

CAACCCAGCGCTXTCCGGAACACCAIGACGGACSAIK 

-19 GTCTCGTAGACCGTGCACC

CAGAGCATCTGGCACGTGG

Arg Thr 

MetSerThrAsnProLysProGlnLysLysAsnLysArgAsnThrAsnArgArgProGln
1 ATGAG CACGAATCCTAAACCTCAAAAAAAAAACAAAGG IAACACGAACCGTCGCCCACAG 
TACTCGTCCTEAGGATTTGGM^ 

AspValLysPheProGlyGlyGlyGlnIleValGlyGlyValTyrLeuLeuProArgArg

61 gacgtcaagttcccgggtggcggtcagatcgttggtggagtttacttgttgccgcgcagg
ctgcagttcaagggcccaccgccagtctagcaaccacctcaaatgaacaacggcgcgtcc 

GlyProArgLeuGlyValArgAlaThrArgLysThrSerGluArgSerGlnProArgGly
121 GGCCCTAGATTGGGTGTCCGCGCGACGAGAAAGACTTCCGAGCGGTCGCAACCTCGAGGT
CCGGGATCTi^CCCACACGCGCGCMCTCTTTCTGAAGGCtrCGCCAGCGTTGGAGCTCCA 

Fig. 17-1

ArgArgGlnProIleProLysAlaArgArgProGluGlyArgThrTrpAlaGlnProGly
181 AGACGTCAGCCTATCCCCAAGGCTCGTCGGCCCGAGGGCAGGACCTGGGCTCAGCCCGGG
TCTGCAGTCGGATAGGGGTTCCGAGCAGCCGGGCTCCCGTCCTGGACCCGAGTCGGGCCC

TyrProTrpProLeuTyrGlyAsnGluGlyCysGlyTrpAlaGlyTrpLeuLeuSerPro
241 TACCCTTGGCCCCTCTATGGCAATGAGGGCTGCGGGTGGGCGGGATGGCTCCTGTCTCCC
ATGGGAACCGGGGAGATACCGTTACXCCCGACGCCCACCCGCCCTAGCGAGGACAGAGGG 

ArgGlySerArgProSerTrpGlyProThrAspPr^ 
301 CGTGGCTCTCGGCCTAGCXGGGGCCCCACAGACCCCCGGCGTAGGTCGCGCAATTTGGGT 

LysVallleAspShrl^uafcrCysGlyPh^ 
361 AAGGTCATCGATACCCTTACGK^^ 

TTCCMrAGCTATGGGAATGCACGCCGAAGCGGCTGGAGIACCCCATGTATGGCGAG 

GlyAlaPraLeuGlyGlyAlaAla^ 
421 GGCGCCCCTCTTGGAGGCGCTGCCAGGGGCCTGGCGCATGGCGrCCGGGTTCTGGAAGAC 
CCGCGGGGAGAACCTCCGCGACGGTCCCGGGACCGCGTACCGCAGGCCCAAGACCTTCTG 

Thr 

GlyValAanTyrMatfhrGlyA^ 

4 81 GGCGTGAACTACK3CAACAGGGAACCTTCCTGGTTGCTCTTTCTCTATCTTCCTTCTGGCC 
CCGCACTTOATACGTTGTCCCT^^ 

LeuLeuSerCVsX^uThzValPr 
541 CTG CTCTCTTG CTTGACTGTG CCCG COTCG GC CTACCAAG TGCG CAACTCCACG GGG CTT 
GACGAGAGMCGAACTGACACGGGCGAAGCCGGATGGTTCACGCGTTGAGGTGCCCCG 

TyrHisVzaThxAsnAspCysProAsn^^ 
601 TACCACGTCACCAATCATTGCCOTAACTCG 

s si ctcScactccggggtgcgtcccttgcgttostgagggcaacgcct

AlaMetThrProThrValMaThr^
7 21 GCGATCACCCCTACGGTOTCCACCA^ 

CGCTACTGGGGAIGCCACCGG TGG TCCCT ACCG TTTGAGGGG CG CTGCGT CGAAGCTGCA 

HisIleAflpI^uI^uValGlySerAlaThrLeuCysSexAlaLeuTyrValGlyAspLau 

781 CAG^CGATCTGCITGTCGGGAGCGC 

GTGTAGCTAGACGAACAGCCCTCGCGGTCGGAGACAAGCCGGGAGATGCACCCCCTGGAT 

CvsGlyS^ValPhel^uValGlyGln^ 
84 1 TCCGGGTC0X3TCTTTCTTC!I^ 

ACG CCCAGACAGAAAGAACW3CCGGMGACAAGTGGAAGAGAGGGTCCGCGG 

TjirGlnGlyCysAsnCysSearlleO^Pr^^ 
$01 ACGCAAGGTIGCAATTGCTCTATCTATCCCGGCCATATAACGGGTCACCGCMGGCATGG 
TGCGSTCCAACGMAACGAGAIAGAT^ 

Val 

Asr^etMetMetAsnTrpSerProT^ 

961 GASATGa^WNkC^ 

CTATACTACTACmACCAGMGATGCTGCCGCAACCa 

ProGlnAlallaLeuAflpMetlleAlaG^ 
1021 CCACAAGCCATCTTGGACA2GATCGCTGGTGCTCACTGGGGAGTC 
GGTGTICGGTAfiAACCTGTACTAGCGACCACGAGT^ 

TyrPtoSerMetValGlyAsaTrp^ 
10S1 TATITCTCCATGGT GGGGAACTGGGCG AAGGTCCTGGTAGTGCTG CTGCTATMGCCGGC 
ATAAAGAGGTACCACCCC1TGACCCGCTTCCAGGACCATCACGACGACGATAAACGGCCG 

Fig. 17-2

ValAspAlaGluThrHisValThrGlyGlySerAlaGlyHisThrValSerGlyPheVal
1141 GTCGACGCGGAAACCCACGTCACCGGGGGAAGTGCCGGCCACACTGTGTCTGGATTTGTT
CAGCTGCGCCTTTGGGTGCAGTGGCCCCCTTCACGGCCGGTGTGACACAGACCTAAAGAA 

Serl^uI^uAlaProGlyAlaLysGlnAsnVa^ 
120 1 AGCCTCCTCGCACCAGGCGCCAAGCAGMCGTCCAGCTGATCAACACCAACGGCAGMGG 
TCGGAGGAGCGTGGTCCGCGGMCGTCTTGCAGGTCGACTAGMGTGGTTGCCGTCAACC 

Kiel^uAsJiSerJhrAlaLeuAs^ 
1261 CACCT CAATAGCACGGCC CTGAAC TGCAATGAIAG CCTCAACACCGGCTGGTTGG CAGGG 
GTCGAGTTASCGXGCaSGGACT^^ 

LeuPheTyrHisHisLysPheAsnSersL^ 

1321 CCTTTCTASCACCACAAGro^ 

GAAAAGATAGTGGTGTXCAAGMGAGAAGTCCGACAGGACTCTCCGATCGGTCGACGGCT 

ProLeu^AspPheAspGluGlyTrpGlyProneSemrAlaAsiiGlySerGlyPro 
13 Bl CCCCOTACCGAOTT^^ 

ggggaatggctaaaactggtcccgacc^^ 

AfnXjlaArsProTyrCysa^ 
1441 GA^GCGCCCCTACKCTGGCACTACCCCCCAAAACCTTGCGGTATTC 

CTCOTCGCGGGGATGACGACCGTGATGGGGGGTTOTGGAACGCCATAACACGGGCGCTTC 

SerValCysGlyProValO^rCyaPlieThrP^ 
1501 MreiGTGTGCTCCGGT^^ 

Ar^SerGlyAlaProThrTyrSerTrpGlyGluAa 
15S1 AGGTCGGGCGCGCCCACCTACAGCTGGGGTGAAAATGATACGGACGTCTTCGTCCTTAAC 
TCCAGCCCGCGCGGGTGGAJTGTCGACCCCACTTTIIACTATGCCTGC^AAGCAGGAATTC 

1621 gSggg^^

ThrLysValCysGlyAlaProProCysValIleGlyGlyAlaGlyAsnAsnThrLeuHis
1681 ACCAAGGTGTGCGGAGCGCCTCCTTGTGTCATCGGAGGGGCGGGCAACAACACCCTGCAC
TGGTTCCACACGCCTCGCGGAGGAACACAGTAGCCTCCCCGCCCGTTGTTGTGGGACGTG

CysProThrAspCysPheArgLysHisProAspAlaThrTyrSerArgCysGlySerGly
1741 TGCCCCACTGATTGCTTCCGCAAGCATCCGGACGCCACATACTCTCGGTGCGGCTCCGGT 
ACGGGGTGACTAACGAAGGCGTTCGTAGGCCTGCGGTGTATGAGAGCCACGCCGAGGCCA 

Leu 

ProTrpIleThrProArgCysLeuValAspTyrProTyrArgLeuTrpHisTyrProCys

1801 CCCTGGATCACACCCAGGTGCCTGGTCGACTACCC2GTATAGGCTOTGGCATTATCCTTGT 
GGGACCTAGTGTGGGTCCACGGACCAGCTCATGGGCATATCCGAAACCGTA&TAGGAACA 

^ileAsnTyxIhrilePheLyslleAxgMetTyrValGlyGlyValGluHisArgLeu 
1861 accatcaactacaccatatttaaaatcaggatgtacgtgggaggggtcgaacacaggctg 

TGGTAGTTCATGTCGTATAAATTTIIAGTCCT^ 

GluAlaAlaCysAsnTrpThrArgGlyGluArgCysAspLeuGluAspArgAspArgSer
1921 GAAGCTGCCTGCAACTGGACGCGGGGCGAACGTTGCGATCTGGAAGACAGGGACAGGTCC 
CTTCGACGGACGTTGACCTGCGCCCCGCTTGCAACGCTAGACCTTCTGTCCCTGTCCAGG 

GluLeuSerProLeuLeuLeuThrThrThrGlnTrpGlnValLeuProCysSerPheThr
1981 GAGCTCAGCCCGTTACTGCTGACCACTACACAGTGGCAGGTCCTCCCGTGTTCCTTCACA 
CTCGAGTCGGGCAATGACGACTGGTGATGTGTCACCGTCCAGGAGGGCACAAGGAAGTGT 

ThrLeuProAlaLeuSerThrGlyLeuIleHisLeuHisGlnAsnIleValAspValGln
2041 ACCCTACCAGCCTTGTCCACCGGCCTCATCCACCTCCACCAGAACATTGTGGACGTGCAG
TGGGATGGTCGGAACAGGTGGCCGGAGTAGGTGGAGGTGGTCTTGTAACACCTGCACGTC

Fig. 17-3

TyrLeuTyrGlyValGlySerSerIleAlaSerTrpAlaIleLysTrpGluTyrValVal
2101 TACTTGTACGGGGTGGGGTCAAGCATCGCGTCCTGGGCCATTAAGTGGGAGTACGTCGTT
ATGAACAiTGCCCCACCCCMTTCGTAGCGCAGGACCCGGTAATTCACCCTCATGCA 

2161 CTCCTGTTCCTTCTGCTTGCAGACGCGCGCG'TCTGCTCCTGCTTGTGGATGATGCTACTC 
GAGGACAAGGAAflACGAACGTCTGCGCGCGCAGACGAGGACGAACACCTACTACGATGAG 

IleSerGlnAlaGluAlaAlaLeuGluAsnLeuValIleLeuAsnAlaAlaSerLeuAla
2221 ATATCCCAAGCGGAGGCGGCTTTGGAGAACCTCGTAATACTTAATGCAGCATCCCTGGCC
TATAGGGTTCGCCTCCGCCGAAACCTCTTGGAGCATTATGAATTACGTCGTAGGGACCGG

GirrhrHisGlyLeuValSarPhfiLeuValPhePheCyaPheAlaTrpTyrLeuLysGly 
2281 GGGACGCACGGTCTTGTATCCITCCTCGTGTTCTTCTGCTTTGCATGGTAITITGAAGGGT 
CCCTGCGTGCCAGAACATAGGAAGGAGCaCAAGAAGACGAAACGTACCAlAAACTTCCCA 

LysTrpValProGlyAlaValTyrThrPheTyrGlyMotTrpProLeuI^injeulAuLau 
2341 AAGTGGGTGCCCGGAGCGfiTCTACACCTTCTACGGGAiTGTGGCCTCTCCTCCTGCTCCTG 
TTCACCGICGGGCCTCGCGAGATGTGGAAGATGCCCTACACCGGAGAGGAGGACGAGGAC 

I^uAlaLeuProGlaAxgAlaTyrAlateuAapTh^ 

2401 ttggcgttgccccagcgggcgtacgcgctggacacggaggtggccgcgtcgtgtggcggt 
aaccgcaacggggtcgcccgcatgcgcgacctgtgcctccaccggcgcagcacaccgcca 

valVall^uValGlyl^uMetAlaLeuThrLeuSerProTyrTyrLysArgTyrlleSex 
2461 GTTG TTCTCGTCGGGTTGATGG CGCTGACTCTG TCACCAIATT ACAAGCGCTATATCAGC 
CAACAAGAGCAGCCCAACTACCGCGACTGAGACA^ 

Asa 

Tror^aLftuTrDTrTJLeuGlaTvrPhel^uThrArgValGluAlaGlnlauHisValTrp

HisProThrLeuValPheAspIleThrLysLeuLeuLeuAlaValPheGlyProLeuTrp
2641 CACCCGACTCTGGTATTTGACATCACCAAATTGCTGCTGGCCGTCTTCGGACCCCTTTGG
GTGGGCTGAGACCATAAACTGTAGTGGTMAACGACGACCGGCAGAAGCCTGGGGA 

Ilel^uGlnAlaSerl^uI^uLysValProTyrPhaValArgValGlnGlyLeuLeuArg 
2701 ATTCTTCAAGCCAGTTTGCTTAAAGTACCCTAC1TTGTGCGCGTCCAAGGCCTTCTCCGG 
TAAGAAGTTCGGTCAAACGAATTTCATGGGATGAAACACGC^^ C 

PheCysAlaLeuAlaArgLysMetlleGlyGlyHisTyrValGliiMetValllelleLya 

2761 TTCTGCGCGTTAGCGCGGAAGATGATCGGAGGCCATTACGTGCAAATGGTCATCATTAAG 
AAGACGCGCAATCGCGCCTTCTACTAGCCTCCGGIAATGCACGTTTACCAGTAGTAATTC 

Ix2uGlyAlaI^uThxGlyTtoTyrValT^^ 
2821 TTAGGGGCGCTTACTGGCACCTATGTTTATAACCATCTCACTCCTCTTCGGGACTGGGCG 
AATCCCCGCGAATGACCGTGGATACAAATATTGGTAGAGTGAGGAGAAGCCCTGACCCGC 

HisAsnGlyLeuArgAspLeuAlaValAlaValGluProValValPheSerGlnMetGlu 
2881 CACAACGGCTTGCGAGATCTGGCCGTGGCTGTAGAGCCAGTCGTCTTCTCCCAAATGGAG
GTGTTGCCGAACGCTCTAGACCGGCACCGACATCTCGGTCAGCAGAAGAGGGTTTACCTC

ThrLysLeuIleThrTrpGlyAlaAspThrAlaAlaCysGlyAspIleIleAsnGlyLeu
2941 ACCAAGCTCATCACGTGGGGGGCAGATACCGCCGCGTGCGGTGACATCATCAACGGCTTG 
TGGTTCGAGTAGTGCACCCCCCGTCTATGGCGGCGCACGCCACTGTAGTAGTTGCCGAAC 

ProValSerAlaArgArgGlyArgGluIleLeuLeuGlyProAlaAspGlyMetValSer
3001 CCTGTTTCCGCCCGCAGGGGCCGGGAGATACTGCTCGGGCCAGCCGATGGAATGGTCTCC
GGACAAAGGCGGGCGTCCCCGGCCCTCTATGACGAGCCCGGTCGGCTACCTTACCAGAGG

LysGlyTrpArgLeuLeuAlaProIleThrAlaTyrAlaGlnGlnThrArgGlyLeuLeu
3061 AAGGGGTGGAGGTTGCTGGCGCCCATCACGGCGTACGCCCAGCAGACAAGGGGCCTCCTA
TTCCCCACCTCCAACGACCGCGGGTAGTGCCGCATGCGGGTCGTCTGTTCCCCGGAGGAT

GlyCysIleIleThrSerLeuThrGlyArgAspLysAsnGlnValGluGlyGluValGln
3121 GGGTGCATAATCACCAGCCTAACTGGCCGGGACAAAAACCAAGTGGAGGGTGAGGTCCAG
CCCACGTATTAGTGGTCGGATTGACCGGCCCTGTTTTTGGTTCACCTCCCACTCCAGGTC 

Fig. 17-4

IleValSerThrAlaAlaGlnThrPheLeuAlaThrCysIleAsnGlyValCysTrpThr
3181 ATTGTGTCAACTGCTGCCCAAACCTTCCTGGCAACGTGCATCAATGGGGTGTGCTGGACT
TAACACAGTTGACGACGGGTTTGGAAGGACCGTTGCACGTAGTTACCCCACACGACCTGA

ValTyrHisGlyAlaGlyThrArgThrIleAlaSerProLysGlyProValIleGlnMet
3241 GTCTACCACGGGGCCGGAACGAGGACCATCGCGTCACCCAAGGGTCCTGTCATCCAGATG
CAGATGGTGCCCCGGCCTTGCTCCTGGTAGCGCAGTGGGTTCCCAGGACAGTAGGTCTAC 

Sor Thr 

TyrThrAsnValAspGlnAspLeuValGlyTrpProAlaProGlnGlySerArgSerLeu 
3301 TATACCAATGTAGACCAAGACCTTGXGGGCTGGCCCGCTCCGCAAGGTAGCCGCTCATTG 
ATATGGTTACATCTGGTTCTGGAACACCCGACCGGGCGAGGCGTTCCATCGGCGAGTAAC 

ThrProCysThrCysGlySerSerAspLeuTyrLeuValThrArgHisAlaAspValIle

3361 ACACCCTGCACTTGCGGCTCCTCGGACCTTTACCTGGTCACGAGGCACGCCGATGTCATT 
TGTGGGACGTGAACGCCGAGGAGCCTGGAAATGGACCAGTGCTCCGTGCGGCTACAGTAA 

ProValArgArgArgGlyAspSerArgGlySerLeuLeuSerProArgProIleSerTyr
3421 CCCGTGCGCCGGCGGGGTGATAGCAGGGGCAGCCTGCTGTCGCCCCGGCCCATTTCCTAC
GGGCACGCGGCCGCCCCACTATCGTCCCCGTCGGACGACAGCGGGGCCGGGTAAAGGATG

LeuLysGlySerSerGlyGlyProLeuLeuCysProAlaGlyHisAlaValGlyIlePhe






EP 0 388 232 A1 



LeuGluThrThrMetArgSerProValPheThrAspAsnSerSerProProValValPro 
' ifiol OTA^GA^CCATGAGGTCCCCGGTGTTCACGGAIAACTCCTCTCCACCAGTAGTGCCC 

GATCTCTGTTM 

GlnSerPheGlnValAlaHisLeuHisAlaProThrGlySerGlyLysSerThrLysVal
r^A^TT^AGGTGGCTCACCTCCATGCTCCCACAGGCAGCGGCAAAAGCACCAAGGTC 
GTC^GAAGGTCCACCGAGTGGAGGIACGAGGGTGTCCGTCGCCGTTniTCGTGGTTCCAG 

PTOalaAlalVxMaAlaGlnGlyO^LyBVallAuVall^uAsnProSei^alAlaAla 
rS^TCC^ATCCA^TCAGGGCTATMGGTGCTAGTACTCAACCCCTCTGTTGGTGCA 

^CCGAC^TACGTCGAGTCCCGA^ 

Leu 

acactgggctttcg^tScatctSggc 

ThrAaoAlaThrSerlleLeuGlylleGlyThrVall^uJUpGlnnaGluThrAlaGly 
IGCCTACGGTGraGGEAG^ 



3661 

GT 



TGTGA( 

3841 



3901 

CGGC 



3961 
4021 



Fig. 17-5 



4141 



4201 



KMAGCTTCATMGTTCCCCCCCTClGIAGAGTftJ3AAGAC&GTAAGM5CTTCTl!CACG 

^aactccacagg^ 

Tyr 

MetThxGlTTTxThrGlyAspPheAspSerVallleAflpCyaAanThr^^alll^^ 

Ser 

4381 SwoacKiKeic^^ 

4441 SSSkm^^ 

vMetPheAsDSerSexValLeuCyeGluCys

TyrAspAlaGlyCysAlaTrpTyrGluLeuThrProAlaGluThrThrValArgLeuArg

4561 TATGACGCAGGCTGTGCTTGGTATGAGCTCACGCCCGCCGAGACTACAGTTAGGCTACGA
ATACTGCGTCCGACACGAACCATACTCGAGTGCGGGCGGCTCTGATGTCAATCCGATGCT 

AlaTyrMetAsnThrProGlyLeuProValCysGlnAspHisLeuGluPheTrpGluGly
4621 GCGTACATGAACACCCCGGGGCTTCCCGTGTGCCAGGACCATCTTGAATTTTGGGAGGGC
CGCATGTACTTGTGGGGCCCCGAAGGGCACACGGTCCTGGTAGAACTTAAAACCCTCCCG

ValPheThrGlyLeuThrHisIleAspAlaHisPheLeuSerGlnThrLysGlnSerGly
4681 GTCTTTACAGGCCTCACTCATATAGATGCCCACTTTCTATCCCAGACAAAGCAGAGTGGG
CAGAAATGTCCGGAGTGAGTATATCTACGGGTGAAAGATAGGGTCTGTTTCGTCTCACCC

GluAsnLeuProTyrLeuValAlaryrGli^aKttValCysAl^^ 
4741 GAGAACCTTCCTTACCTGGTAGCGTACCAAGCCACCGTGTGCGCIAGGGCTCAAGCCCCT 
CTCTTGGAAGGAATGGACCATCGCATGGTTCGGTGGCACACGCGATCCCGAGTTCGGGGA 

ProProSerTrpAspGlnMetTrpLysCysLeuIleArgLeuLysProThrLeuHisGly
4801 CCCCCATCGTGGGACCAGATGTGGAAGTGTTTGATTCGCCTCAAGCCCACCCTCCATGGG
GGGGGTAGCACCCTGGTCTACACCTTCACAAACTAAGCGGAGTTCGGGTGGGAGGTACCC

ProThrProi^ul^uTyrArgLeuGlyAlaValGli^ttGluIleThrLeurhrHisPro 
4861 ccaacacccctgctatacagactgggcgctgttcagaatgaaatcaccctgacgcaccca 
gg tt<3tggg g acgatatgtctg acccgcgacaagtcttacttt agtg ggactgc gtgggt 

ValThrLysO?yrlleMetThrCyfitotSerAlaAapLeuGluValValThrS«:ThrTrp 
4 9 21 gtcaccaaatacatcatgacatgcatgtcggccgacctggaggtcgtcacgagcacctgg 
cagtggtttatgtagtactgtacgtacagccggctggacctccagcagtgctcgtggacc 



ValLeuValGlyGlyValLeuAlaAlaLeuAlaAlaTyrCysLeuSerThrGlyCysVal
4981 GTGCTCGTTGGCGGCGTCCTGGCTGCTTTGGCCGCGTATTGCCTGTCAACAGGCTGCGTG
CACGAGCAACCGCCGCAGGACCGACGAAACCGGCGCATAACGGACAGTTGTCCGACGCAC

ValIleValGlyArgValValLeuSerGlyLysProAlaIleIleProAspArgGluVal
5041 GTCATAGTGGGCAGGGTCGTCTTGTCCGGGAAGCCGGCAATCATACCTGACAGGGAAGTC
CAGTATCACCCGTCCCAGCAGAACAGGCCCTTCGGCCGMAGTATCGACTGTCCCTTC^ 

LeuTyrArgGluPheAspGluMetGluGluCysSerGlnHisLeuProTyrIleGluGln
5101 CTCTACCGAGAGTTCGATGAGATGGAAGAGTGCTCTCAGCACTTACCGTACATCGAGCAA
GAGATGGCTCTCAAGCTACTCTACCTTCTCACGAGAGTCGTGAATGGCATGTAGCTCGTT

GlyMetMetLeuAlaGluGlnPheLysGlnLysAlaLeuGlyLeuLeuGlnThrAlaSer
5161 GGGATGATGCTCGCCGAGCAGTTCAAGCAGAAGGCCCTCGGCCTCCTGCAGACCGCGTCC
CCCTACTACGAGCGGCTCGTCAAGTTCGTCTTCCGGGAGCCGGAGGACGTCTGGCGCAGG 

ArgGlnAlaGluValIleAlaProAlaValGlnThrAsnTrpGlnLysLeuGluThrPhe
5221 CGTCAGGCAGAGGTTATCGCCCCTGCTGTCCAGACCAACTGGCAAAAACTCGAGACCTTC
GCAGTCCGTCTCCAATAGCGGGGACGACAGGTCTGGTTGACCGTTTTTGAGCTCTGGAAG

TrpAlaLysHisMetTrpAsnPheIleSerGlyIleGlnTyrLeuAlaGlyLeuSerThr
5281 TGGGCGAAGCATATGTGGAACTTCATCAGTGGGATACAATACTTGGCGGGCTTGTCAACG
ACCCGCTTCGTATACACCTTGAAGTAGTCACCCTATGTTATGAACCGCCCGAACAGTTGC

LeuProGlyAenProAlalleAlaSerLeuMetAlaPheThiAlaJQaValThrSerPro 
5341 CTGCCTGGTAACCCC^CSATTGCTTCATTGATGGCTTTIIACAGCTGCTGTCACCAGCCCA 
GACGGACCATTGGGGCGGTAACGAAGTAACXACCGAAAATGTCGACGACAGTGGTCGGGT 

LeuThrThrSerGlnThrLeuLeuPheAsnIleLeuGlyGlyTrpValAlaAlaGlnLeu
5401 CTAACCACTAGCCAAACCCTCCTCTTCAACATATTGGGGGGGTGGGTGGCTGCCCAGCTC

gattggtgatcggtttgggaggagaagttgtataacccccccacccaccgacgggtcgag

5521 AGTGTTCGACTGGGGAAGGTCCTCAIA^C
«aCAMCTGACCrc^ 

Gly 

GlyAlaLeuValAla^heLyslleMeW 
,581 GGAGCTCTTGTCGCATTCMGATCATGAGCGGTGAGGTCCCCTCCACGGAGGACCTGGTG 
CCTCGAGAACACCGTAAGTTCTAGTACTCGCCACTCCAGGGGAGGXGCCTCCTGGACCAG 

AsnLeuLeuProAlalleLeuSerProGlyAl^ 
5641 AATCTACTGCCCGCCATCCTCTCGCCCGGAGCCCTCGTAGTCGGCGTGGTCTGTGCAGCA 
ITAGATGACGGGCGGTAGGAGAGCGGGCCTCGGGAGCATCAGCCGCACCAGACACGTCGT 

IleteuArgArgHisValGlyProGlyGluGl^^ 
5701 ATACTGCGCCGGCACGTTGGCCCGGGCGAGGGGGCAGTGCAG 

TATGACGCGGCCGTGCAACCGGGCCCGCTCCCCCGTCACGTCACCTACTTGGCCGACTAT 

MaPheAlaSerArgGlyAsnHisValSerPro^ 
5761 GCCTXCGCCTCCCGGGGGAACCATGa?TTCCCCCACGCACTACG!TGCCGGAGAGCGATGCA 
CGGAAGCGGAGGGCCCCCTTGGTACAAAGGGGGTGCGTGATGCACGGCCTCTCGCTACGT 

HisCys 

AlaAlaArgValThrAlalleLeuSarSerl^uTta^ 
5821 GCTGCCCGCGTCACTGCCATACTCAGCAGCCTCACTGTAACCCAGCTCCTGAGGCGACIEG 
CGACGGGCGGAGTGACGGTATGAGTCGTCGGAGTGACATTGGGTCGAGGACTCCGCTGAC 

HisGlnTrpIleSerSerGluCysThrThrProCysSarGlySerlrpLeuArgAsplle 
5881 CACG^GTGGATAAGCTCGGAGTGTACCACTCCATGCayCCGGMCCTGGCTAAGGGACATC 
GTGGTCACCSAOTCGMCCIC^ 

TrpAsp!Trp21eCysGluValX.auSarAspPhelysThrTrpLeuLysAlaLysLeuMet 
5941 TGGGACTGGATATGCGAGGTGTTGAGCGACTTTAAGACCTGGCIAAAAGCTAAGCTCATG _ 
ACCCTGACCTATACGCTCCACMCICGCTGAAATTCTGGACCGATTTTCGATTCGAGTAC ?i 9 • 17-7 

PrcGlnLeuProGlylleProPheValSarCysGlnArgGlyTyrLysGlyV^ 
6001 CCACAGCXGCCTGGGATCCCCTTTGTGTCCMCCAGCGCGGGTATAAGGGGGTCTGGCGA 
GGTGICGACGGACCCTAGGGGAAACACAGGACGGTCGCGCCCATATTCCCCCAGACCGCT 



ValAspGlylleMetHisThrArgCysHiaCysGlyAlaGluIleThrGlyHisValLys 
6061 GTGGACGGCATCATGCACACTCGCTGCCACrGTCGAGCTGAQA!I^CTGGACATGTCAAA 
CACCTGCCGTAGTACGTGTGAGCGACGGTGACACCdCCGACTCTAGTGACCTGTACAGTTT 

AsnGlyXhrKetArglleValGlyPrc^ 
6121 AACGGGACGATGAGGATCGTCGGTCCTAGGACCTGCAGGAACATGTGGAGIGGGACCTTC 
TTGCCCTGCOIACTCCTAGC&GC^ 

PralleAsnAlaTyrThrThrGlyProCysThrf 
6181 CCCATTAAW3CCTACACCa<^^ 

GGGrAATTACflGATGTGGIGCCCGGGGAOVTGGGGGGAA^ 

AlaLeulrpA2^ValSerAlaGluGluTyrValGluIleArgGlnValGlyAspPheHiB 
6241 GCGC$ATGGAGGGTGTC!TGCAGAGGMIAKTG 

CGCG^TACCTCCCACAGACGTCTCCTTATACACCTCTATTCCGTCCACCCCCTC 

oyrValThxGlyMetThrThr^^ 
6301 TACGTGACGGGTATGACTACTGACAATCTCAAAT?GCCCG!I^CAGG!rCCCATCGCCCGAA 
ATGCACTGCCCATACTGATGACTGTIAGAGTraAOGGGCACGGTC 

PhePheThrGluLeuAspGlyValArgLeuHia^ 
6361 TTTTTCACAGAATTGGACGGGGTGCGCCTACATAGGTTTGCGCCCCCCTGCAAGCCCITG 
AAAAAGTGTCTTAACCTGCCCCACGCGGA^ 

LeuArgGluGluValSerPheArgValGlyLauHisGluTyrProValGlySerGlnLeu 
6421 GXGCGGGAGGAGGTATCATTCAGAGTAGGACTCCACGAATACCCGGTAGGGTCGCAATTA 
.~ — ^^^^m^^^m^nm^^^^rT^A^rr.TGAGGTGCTTATGGGCCATCCGAGCGTTAAT 






ProCysGluProGluProAspValAlaValLeuThrSerMetLeuThrAspProSerHis
6481 CCTTGCGAGCCCGAACCGGACGTGGCCGTGTTGACGTCCATGCTCACTGATCCCTCCCAT
GGAACGCTCGGGCTTGGCCTGCACCGGCACAACTGCAGGTACGAGTGACTAGGGAGGGTA

IleThrAlaGluAlaAlaGlyArgArgLeuAlaArgGlySerProProSerValAlaSer
6541 ATAACAGCAGAGGCGGCCGGGCGAAGGTTGGCGAGGGGATCACCCCCCTCTGTGGCCAGC
TATTGTCGTCTCCGCCGGCCCGCTTCCAACCGCTCCCCTAGTGGGGGGAGACACCGGTCG 

SerSerAlaSerGlnLeuSerAlaProSerLeuLysAlaThrCysThrAlaAsnHisAsp
6601 TCCTCGGCTAGCCAGCTATCCGCTCCATCTCTCAAGGCAACTTGCACCGCTAACCATGAC
AGGAGCCGATCGGTCGATAGGCGAGGTAGAGAGTTCCGTTGAACGTGGCGATTGGTACTG

SerProAspAlaGluLeuIleGluAlaAsnLeuLeuTrpArgGlnGluMetGlyGlyAsn
6661 TCCCCTGATGCTGAGCTCATAGAGGCCAACCTCCTATGGAGGCAGGAGATGGGCGGCAAC 
AGGGGACTACGACTCGAGTATCTCCGGTTGGAGGATACCTCCGTCCTCTACCCGCCGTTG 

IleThrArgValGluSerGluAsnLysValValIleLeuAspSerPheAspProLeuVal
6721 ATCACCAGGGTTGAGTCAGAAAACAAAGTGGTGATTCTGGACTCCTTCGATCCGCTTGTG
TAGTGGTCCCAACTCAGTCTTTTGTTTCACCACTAAGACCTGAGGAAGCTAGGCGAACAC

AlaGluGluAspGluArgGluIleSerValProAlaGluIleLeuArgLysSerArgArg
6781 GCGGAGGAGGACGAGCGGGAGATCTCCGTACCCGCAGAAATCCTGCGGAAGTCTCGGAGA
CGCCTCCTCCTGCTCGCCCTCTAGAGGCATGGGCGTCTTTAGGACGCCTTCAGAGCCTCT

PheAlaGlnAlaLeuProValTrpAlaArgProAspTyrAsnProProLeuValGluThr
6841 TTCGCCCAGGCCCTGCCCGTTTGGGCGCGGCCGGACTATAACCCCCCGCTAGTGGAGACG
AAGCGGGTCCGGGACGGGCAAACCCGCGCCGGCCTGATATTGGGGGGCGATCACCTCTGC

TrpLysLysProAspTyrGluProProValValHisGlyCysProLeuProProProLys
6901 TGGAAAAAGCCCGACTACGAACCACCTGTGGTCCATGGCTGTCCGCTTCCACCTCCAAAG
ACCTTTTTCGGGCTGATGCTTGGTGGACACCAGGTACCGACAGGCGAAGGTGGAGGTTTC

SerProProValProProProArgLysLysArgThrValValLeuThrGluSerThrLeu
6961 TCCCCTCCTGTGCCTCCGCCTCGGAAGAAGCGGACGGTGGTCCTCACTGAATCAACCCTA
AGGGGAGGACACGGAGGCGGAGCCTTCTTCGCCTGCCACCAGGAGTGACTTAGTTGGGAT 

Fig. 17-8

Ser

SerThrAlaLeuAlaGluLeuAlaThrArgSerPheGlySerSerSerThrSerGlyIle
7021 TCTACTGCCTTGGCCGAGCTCGCCACCAGAAGCTTTGGCAGCTCCTCAACTTCCGGCATT
AGATGACGGAACCGGCTCGAGCGGTGGTCTTCGAAACCGTCGAGGAGTTGAAGGCCGTAA

ThrGlyAspAsnThrThrThrSerSerGluProAlaProSerGlyCysProProAspSer
7081 ACGGGCGACAATACGACAACATCCTCTGAGCCCGCCCCTTCTGGCTGCCCCCCCGACTCC
TGCCCGCTGTTATGCTGTTGTAGGAGACTCGGGCGGGGAAGACCGACGGGGGGGCTGAGG

PheAla 

AspAlaGluSerTyrSerSerMetProProLeuGluGlyGluProGlyAspProAspLeu
7141 GACGCTGAGTCCTATTCCTCCATGCCCCCCCTGGAGGGGGAGCCTGGGGATCCGGATCTT
CTGCGACTCAGGATAAGGAGGTACGGGGGGGACCTCCCCCTCGGACCCCTAGGCCTAGAA 

SerAspGlySerTrpSerThrValSerSerGluAlaAsnAlaGluAspValValCysCys
7201 AGCGACGGGTCATGGTCAACGGTCAGTAGTGAGGCCAACGCGGAGGATGTCGTGTGCTGC
TCGCTGCCCAGTACCAGTTGCCAGTCATCACTCCGGTTGCGCCTCCTACAGCACACGACG

SerMetSerTyrSerTrpThrGlyAlaLeuValThrProCysAlaAlaGluGluGlnLys
7261 TCAATGTCTTACTCTTGGACAGGCGCACTCGTCACCCCGTGCGCCGCGGAAGAACAGAAA
AGTTACAGAATGAGAACCTGTCCGCGTGAGCAGTGGGGCACGCGGCGCCTTCTTGTCTTT

LeuProIleAsnAlaLeuSerAsnSerLeuLeuArgHisHisAsnLeuValTyrSerThr
7321 CTGCCCATCAATGCACTAAGCAACTCGTTGCTACGTCACCACAATTTGGTGTATTCCACC
GACGGGTAGTTACGTGATTCGTTGAGCAACGATGCAGTGGTGTTAAACCACATAAGGTGG

ThrSerArgSerAlaCysGlnArgGlnLysLysValThrPheAspArgLeuGlnValLeu
7381 arrTrACrAGTGCTTGCCAAAGGCAGAAGAAAGTCACATTTGACAGACTGCAAGTTCTC

AspSerHisTyrGlnAspValLeuLysGluValLysAlaAlaAlaSerLysValLysAla
7441 GACAGCCATTACCAGGACGTACTCAAGGAGGTTAAAGCAGCGGCGTCAAAAGTGAAGGCT
CTGTCGGTAATGGTCCTGCATGAGTTCCTCCAATTTCGTCGCCGCAGTTTTCACTTCCGA

Phe 

AsnLeuLeuSerValGluGluAlaCysSerLeuThrProProHisSerAlaLysSerLys
7501 AACTTGCTATCCGTAGAGGAAGCTTGCAGCCTGACGCCCCCACACTCAGCCAAATCCAAG
TTGAACGATAGGCATCTCCTTCGAACGTCGGACTGCGGGGGTGTGAGTCGGTTTAGGTTC

PheGlyTyrGlyAlaLysAspValArgCysHisAlaArgLysAlaValThrHisIleAsn
7561 TTTGGTTATGGGGCAAAAGACGTCCGTTGCCATGCCAGAAAGGCCGTAACCCACATCAAC
AAACCAATACCCCGTTTTCTGCAGGCAACGGTACGGTCTTTCCGGCATTGGGTGTAGTTG

SerValTrpLysAspLeuLeuGluAspAsnValThrProIleAspThrThrIleMetAla
7621 TCCGTGTGGAAAGACCTTCTGGAAGACAATGTAACACCAATAGACACTACCATCATGGCT
AGGCACACCTTTCTGGAAGACCTTCTGTTACATTGTGGTTATCTGTGATGGTAGTACCGA

LysAsnGluValPheCysValGlnProGluLysGlyGlyArgLysProAlaArgLeuIle
7681 AAGAACGAGGTTTTCTGCGTTCAGCCTGAGAAGGGGGGTCGTAAGCCAGCTCGTCTCATC
TTCTTGCTCCAAAAGACGCAAGTCGGACTCTTCCCCCCAGCATTCGGTCGAGCAGAGTAG

ValPheProAspLeuGlyValArgValCysGluLysMetAlaLeuTyrAspValValThr
7741 GTGTTCCCCGATCTGGGCGTGCGCGTGTGCGAAAAGATGGCTTTGTACGACGTGGTTACA
CACAAGGGGCTAGACCCGCACGCGCACACGCTTTTCTACCGAAACATGCTGCACCAATGT

LysLeuProLeuAlaValMetGlySerSerTyrGlyPheGlnTyrSerProGlyGlnArg
7801 AAGCTCCCCTTGGCCGTGATGGGAAGCTCCTACGGATTCCAATACTCACCAGGACAGCGG 
TTCGAGGGGAACCGGCACTACCCTTCGAGGATGCCTAAGGTTATGAGTGGTCCTGTCGCC 

ValGluPheLeuValGlnAlaTrpLysSerLysLysThrProMetGlyPheSerTyrAsp
7861 GTTGAATTCCTCGTGCAAGCGTGGAAGTCCAAGAAAACCCCAATGGGGTTCTCGTATGAT
CAACTTAAGGAGCACGTTCGCACCTTCAGGTTCTTTTGGGGTTACCCCAAGAGCATACTA

ThrArgCysPheAspSerThrValThrGluSerAspIleArgThrGluGluAlaIleTyr
7921 ACCCGCTGCTTTGACTCCACAGTCACTGAGAGCGACATCCGTACGGAGGAGGCAATCTAC
TGGGCGACGAAACTGAGGTGTCAGTGACTCTCGCTGTAGGCATGCCTCCTCCGTTAGATG

GlnCysCysAspLeuAspProGlnAlaArgValAlaIleLysSerLeuThrGluArgLeu
7981 CAATGTTGTGACCTCGACCCCCAAGCCCGCGTGGCCATCAAGTCCCTCACCGAGAGGCTT
GTTACAACACTGGAGCTGGGGGTTCGGGCGCACCGGTAGTTCAGGGAGTGGCTCTCCGAA

Gly 

TyrValGlyGlyProLeuThrAsnSerArgGlyGluAsnCysGlyTyrArgArgCysArg
8041 TATGTTGGGGGCCCTCTTACCAATTCAAGGGGGGAGAACTGCGGCTATCGCAGGTGCCGG
ATACAACCCCCGGGAGAATGGTTAAGTTCCCCCCTCTTGACGCCGATAGCGTCCACGGCC

AlaSerGlyValLeuThrThrSerCysGlyAsnThrLeuThrCysTyrIleLysAlaArg
8101 GCGAGCGGCGTACTGACAACTAGCTGTGGTAACACCCTCACTTGCTACATCAAGGCCCGG
CGCTCGCCGCATGACTGTTGATCGACACCATTGTGGGAGTGAACGATGTAGTTCCGGGCC

AlaAlaCysArgAlaAlaGlyLeuGlnAspCysThrMetLeuValCysGlyAspAspLeu
8161 GCAGCCTGTCGAGCCGCAGGGCTCCAGGACTGCACCATGCTCGTGTGTGGCGACGACTTA
CGTCGGACAGCTCGGCGTCCCGAGGTCCTGACGTGGTACGAGCACACACCGCTGCTGAAT 

ValValIleCysGluSerAlaGlyValGlnGluAspAlaAlaSerLeuArgAlaPheThr

8221 [nucleotide sequence illegible in source]

8281 [nucleotide sequence illegible in source]

C^CGATACTGGTCCATGAGGCGGGGGGGACCCCTGGGGGGTGTTGGTCTTATGCTGAAC 

GluLeuIleThrSerCysSerSerAsnValSerValAlaHisAspGlyAlaGlyLysArg
8341 GAGCTCATAACATCATGCTCCTCCAACGTGTCAGTC 

ValTyrTyrLeuThrArgAspProThrThrProLeuAlaArgAlaAlaTrpGluThrAla
8401 GTCTACTACCTCACCCGTGACCCTACAACCCCCCTCGCGAGAGCTGCGTGGGAGACAGCA
CAGATGATGGAGTGGGCACTGGGATGTTGGGGGGAGCGCTCTCGACGCACCCTCTGTCGT

ArgHisThrProValAsnSerTrpLeuGlyAsnIleIleMetPheAlaProThrLeuTrp
8461 AGACACACTCCAGTCAATTCCTGGCTAGGCAACATAATCATGTTTGCCCCCACACTGTGG
TCTGTGTGAGGTCAGTTAAGGACCGATCCGTTGTATTAGTACAAACGGGGGTGTGACACC

AlaArgMetIleLeuMetThrHisPhePheSerValLeuIleAlaArgAspGlnLeuGlu
8521 GCGAGGATGATACTGATGACCCATTTCTTTAGCGTCCTTATAGCCAGGGACCAGCTTGAA 
CGCTCCTACTATGACTACTGGGTAAAGAAATCGCAGGAATATCGGTCCCTGGTCGAACTT 

GlnAlaLeuAspCysGluIleTyrGlyAlaCysTyrSerIleGluProLeuAspLeuPro
8581 CAGGCCCTCGATTGCGAGATCTACGGGGCCTGCTACTCCATAGAACCACTTGATCTACCT
GTCCGGGAGCTAACGCTCTAGATGCCCCGGACGATGAGGTATCTTGGTGAACTAGATGGA



ProIleIleGlnArgLeuHisGlyLeuSerAlaPheSerLeuHisSerTyrSerProGly
8641 CCAATCATTCAAAGACTCCATGGCCTCAGCGCATTTTCACTCCACAGTTACTCTCCAGGT
GGTTAGTAAGTTTCTGAGGTACCGGAGTCGCGTAAAAGTGAGGTGTCAATGAGAGGTCCA

GluIleAsnArgValAlaAlaCysLeuArgLysLeuGlyValProProLeuArgAlaTrp
8701 GAAATTAATAGGGTGGCCGCATGCCTCAGAAAACTTGGGGTACCGCCCTTGCGAGCTTGG
CTTTAATTATCCCACCGGCGTACGGAGTCTTTTGAACCCCATGGCGGGAACGCTCGAACC 

Gly 

ArgHisArgAlaArgSerValArgAlaArgLeuLeuAlaArgGlyGlyArgAlaAlaIle
8761 AGACACCGGGCCCGGAGCGTCCGCGCTAGGCTTCTGGCCAGAGGAGGCAGGGCTGCCATA
TCTGTGGCCCGGGCCTCGCAGGCGCGATCCGAAGACCGGTCTCCTCCGTCCCGACGGTAT

CysGlyLysTyrLeuPheAsnTrpAlaValArgThrLysLeuLys 
8821 TGTGGCAAGTACCTCTTCAACTGGGCAGTAAGAACAAAGCTCAAAC
ACACCGTTCATGGAGAAGTTGACCCGTCATTCTTGTTTCGAGTTTG

Fig. 17-10